Abstract. We discuss a number of naturally arising problems in
arithmetic, culled from completely unrelated sources, which turn 
out to have a common formulation involving "thin" orbits. These 
include the local-global problem for integral Apollonian gaskets 
and Zaremba's Conjecture on finite continued fractions with ab- 
solutely bounded partial quotients. Though these problems could 
have been posed by the ancient Greeks, recent progress comes from 
a pleasant synthesis of modern techniques from a variety of fields, 
including harmonic analysis, algebra, geometry, combinatorics, and 
dynamics. We describe the problems, partial progress, and some 
of the tools alluded to above. 
1. Introduction

In this article we will discuss recent developments on several
seemingly unrelated arithmetic problems, which each boil down to the
same issue of proving a "local-global principle for thin orbits". In
each of these problems, we study the orbit

$$\mathcal{O} = \Gamma \cdot v_0$$

of some given vector $v_0 \in \mathbb{Z}^d$, under the action of some
given group or semigroup $\Gamma$ (under multiplication) of $d$-by-$d$
integer matrices. It will turn out that the orbits arising naturally in
our problems are "thin"; roughly speaking, this means that each orbit
is "degenerate" in its algebro-geometric closure, containing relatively
very few points.

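Each of the problems then takes another vector $w_0 \in \mathbb{Z}^d$,
and for the standard inner product $\langle \cdot, \cdot \rangle$ on
$\mathbb{R}^d$, forms the set

$$\mathscr{S} := \langle w_0, \mathcal{O} \rangle \subset \mathbb{Z}$$

of integers, asking what numbers are in $\mathscr{S}$. For an integer
$q \ge 1$, the projection map

$$\mathbb{Z} \to \mathbb{Z}/q\mathbb{Z}$$

can give an obvious obstruction to membership. Let $\mathscr{S} \pmod q$
be the image of this projection,

$$\mathscr{S} \pmod q := \{ s \pmod q : s \in \mathscr{S} \} \subset \mathbb{Z}/q\mathbb{Z}.$$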

For example, suppose that any number in $\mathscr{S}$ leaves a remainder
of 1, 2 or 3 when divided by 4, that is,
$\mathscr{S} \pmod 4 = \{1, 2, 3\}$. Then one can conclude, without any
further consideration, that $10^{10} \notin \mathscr{S}$, since
$10^{10} \equiv 0 \pmod 4$. This is called a local obstruction. Call $n$
admissible if it avoids all local obstructions,

$$n \in \mathscr{S} \pmod q, \quad \text{for all } q \ge 1.$$

In many applications, the set $\mathscr{S} \pmod q$ is significantly
easier to analyze than the set $\mathscr{S}$ itself. But a local-to-global
phenomenon predicts that, if $n$ is admissible, then in fact
$n \in \mathscr{S}$, thereby reducing the seemingly more difficult
problem to the easier one.
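
The local test described above is just a residue lookup. The following is a minimal sketch in Python, assuming the set of residues attained modulo $q$ is already known; the function name `is_locally_obstructed` and the sample residue set are illustrative choices, not notation from the article.

```python
def is_locally_obstructed(n: int, q: int, residues: set[int]) -> bool:
    """Return True if n (mod q) is not among the residues attained mod q."""
    return n % q not in residues


# Hypothetical data matching the example in the text:
# the set only meets the classes 1, 2, 3 modulo 4.
residues_mod_4 = {1, 2, 3}

# 10**10 is divisible by 4, so it is ruled out by the modulus 4 alone.
print(is_locally_obstructed(10**10, 4, residues_mod_4))  # True
print(is_locally_obstructed(7, 4, residues_mod_4))       # False
```

Passing this test for every modulus is what the text calls admissibility; the sketch checks a single modulus only.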

It is the combination of these concepts, (i) thin orbits, and (ii) local- 
global phenomena, which will turn out to be the "beef" of the problems 
we intend to discuss. 
Figure 1. An integral Apollonian gasket.

1.1. Outline. 

We begin in §2 with Zaremba's Conjecture. We will explain how 
this problem arose naturally in the study of "good lattice points" for 
quasi-Monte Carlo methods in multi-dimensional numerical integration,
and how it also has applications to the linear congruential method
for pseudo-random number generators. But the assertion of the con- 
jecture is a statement about continued fraction expansions of rational 
numbers, and as such is so elementary that Euclid himself could have 
posed it. We will discuss recent progress by Bourgain and the author, 
proving a density version of the conjecture. 

We change our focus in §3 to the ancient geometer Apollonius of 
Perga. As we will explain, his straight-edge and compass construction 
of tangent circles, when iterated ad infinitum, gives rise to a beautiful 
fractal circle packing in the plane, such as that shown in Figure 1. 
Recall that the curvature of a circle is just one over its radius. For 
special configurations, all the curvatures of circles in the given packing 
turn out to be integers; these are the numbers shown in Figure 1. We 
will present in §3 progress on the problem: which integers appear? 
It was recently proved by Bourgain and the author that almost every 
admissible number appears. 

In §4, also stemming from Greek mathematics, we describe a local- 
global problem for a thin orbit of Pythagorean triples, as will be defined 
there. This problem is a variant of the so-called Affine Sieve, recently
introduced by Bourgain, Gamburd, and Sarnak. We will explain an 
"almost" local-global theorem in this context due to Bourgain and the 
author. 






ALEX KONTOROVICH 



Finally, these three problems are reformulated to the aforementioned 
common umbrella in §5, where some of the ingredients of the proofs 
are sketched. The problems do not naturally fit in an established 
area of research, having no L-functions or Hecke theory (though they 
are unquestionably problems about whole numbers), being not part 
of the Langlands Program (though involving automorphic forms and 
representations), nor falling under the purview of the classical circle 
method or sieve, which attempt to solve equations or produce primes 
in polynomials (here it is not polynomials that generate points, but 
the aforementioned matrix actions). Instead one must borrow bits 
and pieces from these fields and others. The major tools which we 
aim to highlight throughout include analysis (the circle method, expo- 
nential sum bounds, infinite volume spectral theory), algebra (strong 
approximation, Zariski density, spin and orthogonal groups associ- 
ated to quadratic forms, representation theory), geometry (hyperbolic 
manifolds, circle packings, diophantine approximation), combinatorics 
(sum-product, expander graphs, spectral gaps), and dynamics (ergodic 
theory, mixing rates, the thermodynamic formalism). 

1.2. Notation. 

We use the following standard notation. A quantity is defined via
the symbol ":=", and a concept being defined is italicized. Write
$f \sim g$ for $f/g \to 1$, $f = o(g)$ for $f/g \to 0$, and $f = O(g)$
or $f \ll g$ for $f \le Cg$. Here $C > 0$ is called an implied constant,
and is absolute unless otherwise specified. Moreover, $f \asymp g$ means
$f \ll g \ll f$. We use $e(x) = e^{2\pi i x}$. The cardinality of a
finite set $S$ is written as $|S|$ or $\#S$. The transpose of a vector
$v$ is written $v^t$. The meaning of algebraic symbols can change from
section to section; for example the (semi)group $\Gamma$ and quadratic
form $Q$ will vary depending on the context.
Figure 2. Graphs of the map (2.2) with prime modulus $d = 4547$, and
multiplier $b$ as shown: (a) multiplier $b = 3523$; (b) multiplier
$b = 3535$.



2. Zaremba's Conjecture 

Countless applications require pseudo-random numbers: deterministic
algorithms which "behave randomly." Probably the simplest, oldest, and
best known among these is the so-called linear congruential method: For
some starting seed $x_0$, iterate the map

$$x \mapsto bx + c \pmod d. \tag{2.1}$$

Here $b$ is called the multiplier, $c$ the shift, and $d$ the modulus.
For simplicity, we consider the homogeneous case $c = 0$. To have as
long a sequence as possible, take $d$ to be prime, and $b$ a primitive
root mod $d$, that is, a generator of the cyclic group
$(\mathbb{Z}/d\mathbb{Z})^\times$. In this case we may as well start
with the seed $x_0 = 1$; then the iterates of (2.1) are nothing more
than the map

$$n \mapsto b^n \pmod d. \tag{2.2}$$

We show graphs of this map in Figure 2 for the prime $d = 4547$, with
two choices of roots $b = 3523$ and $b = 3535$. In both cases, the graphs
"look" random, in that, given $b$ and $n$, it is hard to guess where
$b^n \pmod d$ will lie (without just computing). Similarly, given $b$ and
$b^n \pmod d$, it is typically difficult to determine $n$; this is the
classical problem of computing a discrete logarithm.
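
The homogeneous generator (2.1) with $c = 0$ is easy to experiment with. The sketch below is an illustration in Python, not code from the article: the helper names `multiplicative_order` and `lcg_orbit` are made up here. It checks that both multipliers are primitive roots modulo $d = 4547$, so that the iterates visit every nonzero residue exactly once.

```python
def multiplicative_order(b: int, d: int) -> int:
    """Smallest n >= 1 with b**n = 1 (mod d); assumes gcd(b, d) = 1."""
    x, n = b % d, 1
    while x != 1:
        x = (x * b) % d
        n += 1
    return n


def lcg_orbit(b: int, d: int, seed: int = 1) -> list[int]:
    """Iterates of x -> b*x (mod d), starting after the seed, until it recurs."""
    orbit, x = [], seed
    while True:
        x = (x * b) % d
        orbit.append(x)
        if x == seed:
            return orbit


d = 4547
for b in (3523, 3535):
    # b is a primitive root mod the prime d exactly when its order is d - 1.
    print(b, multiplicative_order(b, d) == d - 1, len(lcg_orbit(b, d)))
```

Both multipliers give full period $d - 1$, so period length alone does not distinguish a good generator from a bad one.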

A slightly more rigorous statistical test for randomness is the serial
correlation of pairs: how well can we guess where $b^{n+1}$ is, knowing
$b^n$? To this end, we plot in Figure 3 these pairs, or what is the
same, the pairs

$$\left\{ \left( \frac{b^n}{d}, \frac{b^{n+1}}{d} \right) \pmod 1 \right\}_{n=1}^{d} \subset \mathbb{R}^2/\mathbb{Z}^2 \tag{2.3}$$

in the unit square, with the previous choices of modulus and multiplier.
Focus first on Figure 3a: it looks like a fantastically equidistributed
grid. Keep in mind that the mesh in each coordinate is of size
$1/d \approx 1/4000$, so we have $(4000)^2$ points from which to choose,
yet we are only plotting 4000 points, square-root the total number of
options.

Figure 3. Plots of the points (2.3) for the same choices of modulus
$d = 4547$ and multipliers as in Figure 2: (a) multiplier $b = 3523$;
(b) multiplier $b = 3535$.

On the other hand, look at Figure 3b: these parameters make a terrible
random number generator! Knowing $b^n$, we have about a 1 : 10 chance of
guessing $b^{n+1}$, not 1 : 4000.

A related phenomenon also appears in two-dimensional numerical
integration: Suppose that you wish to integrate a "nice" function $f$
on $\mathbb{R}^2/\mathbb{Z}^2 \cong [0, 1) \times [0, 1)$, say of bounded
variation, $V(f) < \infty$. The idea is to take a large sample of points
$Z$ in $\mathbb{R}^2/\mathbb{Z}^2$, and approximate the integral by the
average of $f(z)$, $z \in Z$. For this to be a good approximation one
obviously needs that $f$ does not vary much in a small ball, and that the
points of $Z$ are well-distributed throughout $\mathbb{R}^2/\mathbb{Z}^2$.
In fact, Koksma and Hlawka showed, rather beautifully, that this is all
that one needs to take into account:

$$\left| \int_0^1 \int_0^1 f(x, y)\, dx\, dy - \frac{1}{|Z|} \sum_{z \in Z} f(z) \right| \le V(f) \cdot \mathrm{Disc}(Z).$$



Here Disc is the discrepancy of the set $Z$, defined as follows. Take a
rectangle $R = [a, b] \times [c, d] \subset \mathbb{R}^2/\mathbb{Z}^2$.
One would like the fraction of points in $R$ to be close to its area, so
set

$$\mathrm{Disc}(Z) := \sup_{R \subset \mathbb{R}^2/\mathbb{Z}^2} \left| \frac{\#(Z \cap R)}{\#Z} - \mathrm{Area}(R) \right|.$$

It is elementary that for a growing family
$Z^{(k)} \subset \mathbb{R}^2/\mathbb{Z}^2$, $|Z^{(k)}| \to \infty$, the
discrepancy $\mathrm{Disc}(Z^{(k)})$ decays to 0 if and only if $Z^{(k)}$
becomes equidistributed in $\mathbb{R}^2/\mathbb{Z}^2$. But the
discrepancy itself is a finer measure of the rate of this decay. For
example, observe that for any finite sample set $Z$, we have the lower
bound $\mathrm{Disc}(Z) \ge 1/|Z|$. Indeed, take a family of rectangles
$R$ zooming in on a single point in $Z$; the



FROM APOLLONIUS TO ZAREMBA 






proportion of points in $R$ is always at least $1/|Z|$, while the area
of $R$ can be made arbitrarily small. It turns out there is a sharpest
possible lower bound, due to Schmidt [Sch72]:

$$\text{For any finite } Z \subset \mathbb{R}^2/\mathbb{Z}^2, \quad \mathrm{Disc}(Z) \gg \frac{\log |Z|}{|Z|}. \tag{2.4}$$

For standard Monte Carlo integration, one often samples $z \in Z$
according to the uniform measure; the Central Limit Theorem then
predicts that

$$\mathrm{Disc}(Z) \approx \frac{1}{|Z|^{1/2}}, \tag{2.5}$$

ignoring log log factors. So comparing (2.5) to (2.4), it is clear that
uniformly sampled sequences are far from optimal in numerical
integration. Alternatively, one could take $Z$ to be an evenly spaced
$d$-by-$d$ grid,

$$Z = \{ (i/d, j/d) : 0 \le i, j < d \}.$$

But then the rectangle $[\epsilon, 1/d - \epsilon] \times [0, 1]$ contains
no grid points while its area is almost $1/d = 1/|Z|^{1/2}$, again giving
(2.5).

In the quasi-Monte Carlo method, rather than sampling uniformly, one
tries to find a special sample set $Z$ to come as close as possible to
the optimal discrepancy (2.4). Ideally, such a set $Z$ would also be
quickly and easily constructible by a computer algorithm. Not
surprisingly, the set $Z$ illustrated in Figure 3a makes an excellent
sample set. It was this problem which led Zaremba to his theorem and
conjecture, described below.

Returning to our initial discussion, observe that the sequence (2.3)
is essentially (since $b$ is a generator) the same as

$$Z_{b,d} := \left\{ \left( \frac{n}{d}, \frac{bn}{d} \right) \pmod 1 \right\}_{n=1}^{d} \subset \mathbb{R}^2/\mathbb{Z}^2. \tag{2.6}$$

And this is nothing more than a graph of our first map (2.1). Now it is
clear that both Figures 3a and 3b are "lines", but the first must be
"close to a line with irrational slope," causing the equidistribution.
This Diophantine property is best described in terms of continued
fractions, as follows.

For $x \in (0, 1)$, we use the notation

$$x = [a_1, a_2, \ldots]$$

for the continued fraction expansion

$$x = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \ddots}}.$$

The integers $a_j \ge 1$ are called partial quotients of $x$. Rational
numbers have finite continued fraction expansions.
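
For a rational number the partial quotients are exactly the quotients produced by Euclid's algorithm. The sketch below is an illustration in Python (the function name `continued_fraction` is an assumption of this example); it expands the two fractions $b/d$ built from the modulus $d = 4547$ and the multipliers $b = 3523$, $3535$ used in Figures 2 and 3.

```python
def continued_fraction(b: int, d: int) -> list[int]:
    """Partial quotients [a1, ..., ak] of b/d for 0 < b <= d, via Euclid's algorithm."""
    quotients = []
    while b:
        a, r = divmod(d, b)   # d = a*b + r with 0 <= r < b
        quotients.append(a)
        d, b = b, r
    return quotients


print(continued_fraction(3523, 4547))  # [1, 3, 2, 3, 1, 2, 3, 2, 1, 3]
print(continued_fraction(3535, 4547))  # [1, 3, 2, 35, 1, 1, 1, 4]
```

Euclid's algorithm returns the expansion whose last partial quotient is at least 2 (for $d > 1$); the same rational can also be written with that last quotient split as $a_k - 1, 1$.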

One is then immediately prompted to study the continued fraction
expansions of the "slopes" $b/d$ in Figure 3:

$$3523/4547 = [1, 3, 2, 3, 1, 2, 3, 2, 1, 3],$$
$$3535/4547 = [1, 3, 2, 35, 1, 1, 1, 4].$$

Note the very large partial quotient 35 in the middle of the second
expression, while the partial quotients in the first are all at most 3.
Observations of this kind naturally led Zaremba to the following

Theorem 2.7 (Zaremba 1966 [Zar66, Corollary 5.2]). Fix $(b, d) = 1$
with $b/d = [a_1, a_2, \ldots, a_k]$ and let $A := \max a_j$. Then for
$Z_{b,d}$ given in (2.6),

$$\mathrm{Disc}(Z_{b,d}) \le \left( \frac{4A}{\log(A+1)} + \frac{4A+1}{\log d} \right) \frac{\log d}{d}. \tag{2.8}$$

Since $|Z_{b,d}| = d$, comparing (2.8) to (2.4) shows that the sequences
(2.6) are essentially best possible, up to the "constant" $A$, cf. Figure
3a. But the previous sentence is complete nonsense: $A$ is not constant
at all; it depends on $d$,^1 cf. Figure 3b.

^1 The value $A$ also depends on $b$, but the important variable for
applications is $d$.

With this motivation, Zaremba predicted that in fact $A$ can be taken
constant:

Conjecture Z (Zaremba 1972 [Zar72, p. 76]). Every natural number
is the denominator of a reduced fraction whose partial quotients are
absolutely bounded.

That is, there exists some absolute $A > 1$ so that for each $d \ge 1$,
there is some $(b, d) = 1$, so that $b/d = [a_1, \ldots, a_k]$ with
$\max a_j \le A$.

Zaremba even suggested a sufficient value for $A$, namely $A = 5$. So
this is really a problem that could have been posed in Book VII of
the Elements (after Euclid's algorithm): using the partial quotients
$a_j \in \{1, \ldots, 5\}$, does the set of (reduced) fractions with
expansion $[a_1, \ldots, a_k]$ contain every integer as a denominator?
The reason for Zaremba's guess $A = 5$ is simply that it is false for
$A = 4$, as we now explain. First some more notation.

Let $\mathfrak{R}_A$ be the set of rationals with the desired property
that all partial quotients are at most $A$:

$$\mathfrak{R}_A := \left\{ \frac{b}{d} = [a_1, \ldots, a_k] : (b, d) = 1, \text{ and } a_j \le A, \ \forall j \right\},$$

and let $\mathfrak{D}_A$ be the set of denominators which arise:

$$\mathfrak{D}_A := \left\{ d : \exists\, (b, d) = 1 \text{ with } \frac{b}{d} \in \mathfrak{R}_A \right\}.$$

Then Zaremba's conjecture is that $\mathfrak{D}_5 = \mathbb{N}$, and we
claim that this is false for $\mathfrak{D}_4$. Indeed,
$6 \notin \mathfrak{D}_4$: the only numerators to try are 1 and 5, but
the continued fraction expansion of $1/6$ is just $[6]$, and
$5/6 = [1, 5]$, so the largest partial quotient in both is too big.
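
The check for $d = 6$ can be automated for every small denominator. The brute-force sketch below is an illustration in Python with made-up helper names. It assumes, as in the computation above, that each fraction is expanded by Euclid's algorithm, and it lists the denominators up to a bound for which every reduced numerator produces some partial quotient larger than $A$.

```python
from math import gcd


def max_partial_quotient(b: int, d: int) -> int:
    """Largest partial quotient in the Euclidean expansion of b/d, 0 < b <= d."""
    largest = 0
    while b:
        a, r = divmod(d, b)
        largest = max(largest, a)
        d, b = b, r
    return largest


def missing_denominators(A: int, N: int) -> list[int]:
    """Denominators d <= N for which no reduced b/d has all quotients <= A."""
    missing = []
    for d in range(1, N + 1):
        found = any(
            gcd(b, d) == 1 and max_partial_quotient(b, d) <= A
            for b in range(1, d + 1)
        )
        if not found:
            missing.append(d)
    return missing


print(missing_denominators(4, 200))
print(missing_denominators(5, 200))
```

The search is quadratic in the bound, which is adequate for exploring small ranges but not for serious verification.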

That said, there are only two other numbers, 54 and 150, known to be
missing from $\mathfrak{D}_4$ (see [OEI]), leading one to ask what happens
if a finite number of exceptions is permitted. Indeed, Niederreiter
[Nie78, p. 990] conjectured in 1978 that for $A = 3$, $\mathfrak{D}_3$
already contains every sufficiently large number; we write this as

$$\mathfrak{D}_3 \supset \mathbb{N}_{\gg 1}.$$

With lots more computational capacity and evidence, Hensley almost 20
years later [Hen96] conjectured even more boldly that the same holds
already for $A = 2$:

$$\mathfrak{D}_2 \supset \mathbb{N}_{\gg 1}. \tag{2.9}$$

Lest the reader be tempted to one-up them all, let us consider the case
$A = 1$. Here $\mathfrak{R}_1$ contains only continued fractions of the
form $[1, \ldots, 1]$, and these are quotients of consecutive Fibonacci
numbers

$$\mathfrak{R}_1 = \{ F_n / F_{n+1} \}.$$

So $\mathfrak{D}_1 = \{F_n\}$ is just the Fibonacci numbers, and this is
an exponentially thin sequence.
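
The Fibonacci claim is quick to confirm with exact rational arithmetic. This is a small sketch in Python; the helper `evaluate` and the indexing $F_1 = F_2 = 1$ are conventions chosen for this example.

```python
from fractions import Fraction


def evaluate(quotients: list[int]) -> Fraction:
    """Value of the finite continued fraction [a1, ..., ak] as an exact rational."""
    value = Fraction(0)
    for a in reversed(quotients):
        value = 1 / (a + value)
    return value


fib = [1, 1]                      # F_1, F_2
while len(fib) < 12:
    fib.append(fib[-1] + fib[-2])

for k in range(1, 11):
    # [1, ..., 1] with k ones equals F_k / F_{k+1}.
    assert evaluate([1] * k) == Fraction(fib[k - 1], fib[k])

print([str(evaluate([1] * k)) for k in range(1, 7)])
# ['1', '1/2', '2/3', '3/5', '5/8', '8/13']
```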

In fact, Hensley conjectured something much stronger than (2.9). First
some more notation. Let $\mathfrak{C}_A$ be the set of limit points of
$\mathfrak{R}_A$,

$$\mathfrak{C}_A := \{ [a_1, a_2, \ldots] : a_j \le A, \ \forall j \}.$$

This is a Cantor-like set with some Hausdorff dimension

$$\delta_A := \dim(\mathfrak{C}_A). \tag{2.10}$$
Figure 4. The Cantor set
$\mathfrak{C}_2 = \bigcap_{k=1}^{\infty} \mathfrak{C}_2^{(k)}$, where
$\mathfrak{C}_A^{(k)} = \{ [a_1, \ldots, a_j, \ldots, a_k, \ldots] : a_j \le A \text{ for all } 1 \le j \le k \}$
restricts only the first $k$ partial quotients. The panels show
$\mathfrak{C}_2^{(1)} = \{ [a_1, \ldots] : a_1 \le 2 \}$ and
$\mathfrak{C}_2^{(2)} = \{ [a_1, \ldots] : a_1, a_2 \le 2 \}$.

To get our bearings, consider again the case $A = 1$. Then
$\mathfrak{C}_1 = \{1/\varphi\}$ is just the singleton consisting of the
reciprocal of the golden mean, and hence $\delta_1 = 0$.

Now take $A = 2$. Consider the unit interval $[0, 1]$. The numbers in
the range $(1/2, 1]$ have first partial quotient $a_1 = 1$, and those in
$(1/3, 1/2]$ have first partial quotient $a_1 = 2$. The remaining interval
$[0, 1/3]$ has numbers whose first partial quotient is already too big,
and thus is cut out. We repeat in this way, cutting out intervals for
each partial quotient, and arriving at $\mathfrak{C}_2$; see Figure 4.
There is a substantial literature estimating the dimension $\delta_2$
which we will not survey, but the current record is due to
Jenkinson-Pollicott [JP01], whose superexponential algorithm estimates

$$\delta_2 = 0.5312805062772051416244686\ldots \tag{2.11}$$

If we relax the bound $A$, the Cantor sets increase, as do their
dimensions. In fact, Hensley [Hen92] determined the asymptotic
expansion, which to first order is

$$\delta_A = 1 - \frac{6}{\pi^2 A} + o\!\left( \frac{1}{A} \right), \tag{2.12}$$

as $A \to \infty$. In particular, the dimension can be made arbitrarily
close to 1 by taking $A$ large.

We can now explain Hensley's stronger conjecture. His observation is
that one need not only consider restricting the partial quotients $a_j$
to the full interval $[1, A]$; one can allow more flexibility by fixing
any finite "alphabet" $\mathcal{A} \subset \mathbb{N}$, and restricting
the partial quotients to the "letters" in this alphabet. To this end,
let $\mathfrak{C}_\mathcal{A}$ be the Cantor set

$$\mathfrak{C}_\mathcal{A} := \{ [a_1, a_2, \ldots] : a_j \in \mathcal{A}, \ \forall j \ge 1 \},$$

and similarly let $\mathfrak{R}_\mathcal{A}$ be the partial convergents
to $\mathfrak{C}_\mathcal{A}$, $\mathfrak{D}_\mathcal{A}$ the denominators
of $\mathfrak{R}_\mathcal{A}$, and $\delta_\mathcal{A}$ the Hausdorff
dimension of $\mathfrak{C}_\mathcal{A}$. Then Hensley's claim is the
following

Conjecture 2.13 (Hensley 1996 [Hen96, Conjecture 3, p. 16]).

$$\mathfrak{D}_\mathcal{A} \supset \mathbb{N}_{\gg 1} \iff \delta_\mathcal{A} > 1/2. \tag{2.14}$$

Observe in particular that $\delta_2$ in (2.11) exceeds 1/2, and hence
(2.14) implies that (2.9) holds.

Here is some heuristic evidence in favor of (2.14). Let us visualize
the set $\mathfrak{R}_\mathcal{A}$ of rationals, by grading each fraction
according to the denominator. That is, plot each fraction $b/d$ at
height $d$, showing the set

$$\left\{ \left( \frac{b}{d}, d \right) : \frac{b}{d} \in \mathfrak{R}_\mathcal{A}, \ (b, d) = 1 \right\}. \tag{2.15}$$

We show this plot in Figure 5a for $\mathcal{A} = \{1, 2\}$ truncated at
height $N = 10000$, and in Figure 5b for $\mathcal{A} = \{1, 2, 3, 4, 5\}$
truncated at height $N = 1000$. We give a name to this truncation,
defining

$$\mathfrak{R}_\mathcal{A}(N) := \left\{ \frac{b}{d} \in \mathfrak{R}_\mathcal{A} : (b, d) = 1, \ 1 \le b < d < N \right\}.$$

Observe first that the limiting vertical directions in which the figures
grow are precisely the Cantor sets $\mathfrak{C}_\mathcal{A}$; compare
Figures 5a and 4. Moreover, note that if at least one point has been
placed at height $d$, then $d \in \mathfrak{D}_\mathcal{A}$. That is, the
"beef" of this problem boils down to: what is the projection of the
plots in Figure 5 to the $y$-axis? In particular, does every
(sufficiently large) integer appear?

The first question to address is: how big is
$|\mathfrak{R}_\mathcal{A}(N)|$, that is, how many points are being
plotted in Figures 5a and 5b? Hensley [Hen89] showed that, as
$N \to \infty$,

$$\#\mathfrak{R}_\mathcal{A}(N) \asymp N^{2\delta_\mathcal{A}}, \tag{2.16}$$

where the implied constant can depend on $\mathcal{A}$. (Hensley proved
this for the alphabet $\mathcal{A} = \{1, 2, \ldots, A\}$, but the same
proof works for an arbitrary finite $\mathcal{A}$.)
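
The growth rate (2.16) can be probed numerically for small $N$. The sketch below is an illustration in Python with invented helper names. It assumes each fraction is expanded by Euclid's algorithm, counts the reduced fractions with $1 \le b < d < N$ whose partial quotients all lie in the alphabet, and prints a crude growth exponent between two values of $N$.

```python
from math import gcd, log


def quotients_in_alphabet(b: int, d: int, alphabet: set[int]) -> bool:
    """True if every quotient in the Euclidean expansion of b/d lies in alphabet."""
    while b:
        a, r = divmod(d, b)
        if a not in alphabet:
            return False
        d, b = b, r
    return True


def count_fractions(alphabet: set[int], N: int) -> int:
    """Number of reduced b/d with 1 <= b < d < N and all quotients in alphabet."""
    return sum(
        1
        for d in range(2, N)
        for b in range(1, d)
        if gcd(b, d) == 1 and quotients_in_alphabet(b, d, alphabet)
    )


alphabet = {1, 2}
small, large = count_fractions(alphabet, 200), count_fractions(alphabet, 800)
# Crude exponent between N = 200 and N = 800; (2.16) predicts about 2 * delta.
print(small, large, log(large / small) / log(4))
```

At such small heights the implied constants still matter, so the printed exponent should be read as a rough indication only.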

Now, the $\Longrightarrow$ direction of (2.14) is trivial. Indeed, let

$$\mathfrak{D}_\mathcal{A}(N) := \mathfrak{D}_\mathcal{A} \cap [1, N],$$

so that the left hand side of (2.14) is equivalent to

$$\#\mathfrak{D}_\mathcal{A}(N) = N + O(1), \quad \text{as } N \to \infty. \tag{2.17}$$

Then it is clear that $\#\mathfrak{R}_\mathcal{A}(N)$ counts $d$'s with
multiplicity, whereas $\#\mathfrak{D}_\mathcal{A}(N)$ counts each
appearing $d$ only once; hence

$$\#\mathfrak{D}_\mathcal{A}(N) \le \#\mathfrak{R}_\mathcal{A}(N) \asymp N^{2\delta_\mathcal{A}}. \tag{2.18}$$

So if (2.17) holds, then (2.18) implies that $2\delta_\mathcal{A}$ must be
at least 1.

Figure 5. For each $b/d \in \mathfrak{R}_\mathcal{A}(N)$, plot $b/d$
versus $d$, with $A$ and truncation parameter $N$ as shown: (a) $A = 2$,
$N = 10000$; (b) $A = 5$, $N = 1000$.

A caveat: we do not know how to verify (2.17) for a single alphabet!
Nevertheless the content of Hensley's Conjecture is clearly the opposite
$\Longleftarrow$ direction. Here is some evidence in favor of this claim.



An old theorem of Marstrand's [Mar54] states the following. Let
$E \subset [0, 1] \times [0, 1]$ be a Hausdorff measurable set having
Hausdorff dimension $\alpha > 1$. Then the projection of $E$ into a line
of slope $\tan\theta$ is "large," for Lebesgue-almost every
$\theta \in \mathbb{R}/2\pi\mathbb{Z}$. Here "large" means of positive
Lebesgue measure. One may thus heuristically think of (2.15) as $E$
above, with (2.16) suggesting the "dimension"
$\alpha = 2\delta_\mathcal{A}$. Then $\mathfrak{D}_\mathcal{A}$ is the
projection of this $E$ to the $y$-axis, and it should be "large"
according to the analogy. Marstrand's theorem says nothing about an
individual line, and does not apply to the countable set (2.15), so the
analogy cannot be furthered in any meaningful way. Nevertheless, we see
the condition $\alpha > 1$ is converted into $2\delta > 1$, giving
evidence for the $\Longleftarrow$ direction of (2.14).

For another heuristic, if one uniformly samples $N^{2\delta}$ pairs
$(b, d)$ out of the integers up to $N$, a given $d$ is expected to appear
with multiplicity roughly $N^{2\delta - 1}$. For $\delta > 1/2$ and $N$
growing, this multiplicity will be positive with probability tending
to 1.

This heuristic does not rule out the possible conspiracy that only very
few (about $N^{2\delta - 1}$) $d$'s actually appear, each with very high
(about $N$) multiplicity. But such an argument leads to another bit of
evidence towards (2.14): since the multiplicity of any $d < N$ is at
most $N$, we have the elementary lower bound

$$\#\mathfrak{D}_\mathcal{A}(N) \ge \frac{1}{N} \#\mathfrak{R}_\mathcal{A}(N) \gg \frac{1}{N} N^{2\delta_\mathcal{A}} = N^{2\delta_\mathcal{A} - 1}.$$

So if $\delta_\mathcal{A} > 1/2$, then the set $\mathfrak{D}_\mathcal{A}$
already grows at least at a power rate. Furthermore, for any fixed
$\epsilon > 0$, one can take some $\mathcal{A} = \mathcal{A}(\epsilon)$
sufficiently large so that $2\delta_\mathcal{A} - 1 > 1 - \epsilon$. For
example, using (2.12), we can take $\mathcal{A} = \{1, 2, \ldots, A\}$
where

$$A \ge \frac{12}{\pi^2 \epsilon} (1 + o(1)).$$

Here $o(1) \to 0$ as $\epsilon \to 0$. Hence one can produce
$N^{1 - \epsilon}$ points in $\mathfrak{D}_\mathcal{A}(N)$, which is
already substantial progress towards (2.17).

But unfortunately, Hensley's conjecture (2.14), as stated, is false.

Lemma 2.19 (Bourgain-K. 2011 [BK11, Lemma 1.19]). The alphabet
$\mathcal{A} = \{2, 4, 6, 8, 10\}$ has dimension
$\delta_\mathcal{A} = 0.517\ldots$, which exceeds 1/2, but
$\mathfrak{D}_\mathcal{A}$ does not contain every sufficiently large
number.

Proof. The dimension can be computed by the Jenkinson-Pollicott
algorithm used to establish (2.11). It is an elementary calculation from
the definitions to show for this alphabet that every fraction in
$\mathfrak{R}_\mathcal{A}$ is of the form $2m/(4n+1)$ or $(4n+1)/(2m)$,
and so $\mathfrak{D}_\mathcal{A} \equiv \{0, 1, 2\} \pmod 4$. Hence
$\mathfrak{D}_\mathcal{A}$ does not contain every sufficiently large
number. □
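
One way to carry out the elementary calculation in the proof is to reduce modulo 4 the matrix products appearing in (2.24) of §2.1. The following is a sketch of that computation under this choice of method; it is not necessarily the argument given in [BK11], and the auxiliary set $H$ is notation introduced only here.

```latex
% For even letters a, a', the product of two generators is
\[
\begin{pmatrix} 0 & 1 \\ 1 & a \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & a' \end{pmatrix}
= \begin{pmatrix} 1 & a' \\ a & aa' + 1 \end{pmatrix},
\qquad aa' + 1 \equiv 1 \pmod 4 .
\]
% Let H be the set of integer matrices whose diagonal entries are
% 1 (mod 4) and whose off-diagonal entries are even. H is closed under
% multiplication, because a product of two even numbers is 0 (mod 4).
% Hence every product of an even number of generators lies in H, and its
% second column is
\[
(b, d)^t = (2m, \ 4n + 1)^t .
\]
% Multiplying M = (alpha, beta; gamma, delta) in H (or M = I) on the
% right by one more generator gives the second column
\[
(b, d)^t = (\alpha + \beta a, \ \gamma + \delta a)^t,
\qquad \alpha + \beta a \equiv 1 \pmod 4, \quad \gamma + \delta a \ \text{even},
\]
% so products of odd length give fractions (4n + 1)/(2m). In both cases
% the denominator d is never 3 (mod 4).
```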

That is, there can be congruence obstructions, in addition to the
condition on dimension. This suggests instead a closer analogy with
Hilbert's 11th problem on numbers represented by quadratic forms.
According to this analogy, we make the following
Definition 2.20. Call $d$ represented by the given alphabet $\mathcal{A}$
if $d \in \mathfrak{D}_\mathcal{A}$. Also, call $d$ admissible for the
alphabet $\mathcal{A}$ if it is everywhere locally represented, meaning
that $d \in \mathfrak{D}_\mathcal{A} \pmod q$ for all $q \ge 1$.

One can then modify Hensley's conjecture to state that, if
$\delta_\mathcal{A}$ exceeds 1/2, then every sufficiently large
admissible number is represented, akin to Hasse's local-to-global
principle.

Remark 2.21. We will explain in §2.2 that the alphabet
$\mathcal{A} = \{1, 2\}$ has no local obstructions, so (2.9) is still
plausible as it stands.

Here is some progress towards the conjecture. 

Theorem Z (Bourgain-K. 2011 [BK11]). Almost every natural number is the
denominator of a reduced fraction whose partial quotients are bounded
by 50.

Here "almost every" is in the sense of density: for
$\mathcal{A} = \{1, 2, \ldots, 50\}$,

$$\frac{1}{N} \#(\mathfrak{D}_\mathcal{A} \cap [1, N]) \to 1,$$

as $N \to \infty$. The proof in fact shows that for any alphabet
$\mathcal{A}$ having sufficiently large dimension

$$\delta_\mathcal{A} > \delta_0, \tag{2.22}$$

almost every admissible number is represented, where the value

$$\delta_0 = 1 - 5/312 \approx 0.98 \tag{2.23}$$

is sufficient. Using Hensley's (2.12), the value $A = 50$ seems to
satisfy (2.22). The reason Theorem Z needs no mention of admissibility
is that any alphabet $\mathcal{A}$ with such a large dimension (2.23)
must contain both 1 and 2; missing even one of these letters will drop
the dimension by too much. Hence there are actually no local
obstructions in the theorem, cf. Remark 2.21.
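
The comparison of $A = 50$ with the threshold (2.23) can be made concrete using only the leading term of Hensley's asymptotic, $\delta_A \approx 1 - 6/(\pi^2 A)$. The sketch below, in Python, drops the error term entirely, so it illustrates why $A = 50$ "seems" sufficient rather than proving it; the function name is an assumption of this example.

```python
from math import pi

DELTA_0 = 1 - 5 / 312  # threshold (2.23)


def delta_first_order(A: int) -> float:
    """Leading term of Hensley's asymptotic: 1 - 6 / (pi^2 * A)."""
    return 1 - 6 / (pi**2 * A)


print(delta_first_order(50), DELTA_0)
# Smallest A at which the leading-term approximation exceeds the threshold.
print(next(A for A in range(1, 1000) if delta_first_order(A) > DELTA_0))  # 38
```

The leading term alone would already clear the threshold at $A = 38$, which leaves some room for the neglected lower-order terms at $A = 50$.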

To explain the source of this progress, we reformulate Zaremba's 
problem in a way that highlights the role of the hitherto unmentioned 
"thin orbit" lurking underneath. 



2.1. Reformulation. 

The key to the above progress is the old and elementary observation
that

$$\frac{b}{d} = [a_1, \ldots, a_k]$$

is equivalent to

$$\begin{pmatrix} * & b \\ * & d \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix} \cdots \begin{pmatrix} 0 & 1 \\ 1 & a_k \end{pmatrix}. \tag{2.24}$$

With this observation, it is natural to introduce the semigroup
generated by matrices of the above form with partial quotients
restricted to the given alphabet. Let

$$\Gamma = \Gamma_\mathcal{A} := \left\langle \begin{pmatrix} 0 & 1 \\ 1 & a \end{pmatrix} : a \in \mathcal{A} \right\rangle^{+}, \tag{2.25}$$

where the "+" denotes generation as a semigroup (no inverse matrices).
Then the orbit

$$\mathcal{O} = \mathcal{O}_\mathcal{A} := \Gamma \cdot v_0 \tag{2.26}$$

with

$$v_0 = (0, 1)^t \tag{2.27}$$

isolates the set of second columns in $\Gamma$, and from (2.24) is hence
in bijection with the set $\mathfrak{R}_\mathcal{A}$. The "thinness" of
the orbit is explained by (2.16), which implies that

$$\#\{ v \in \mathcal{O} : \|v\| < N \} \asymp N^{2\delta_\mathcal{A}},$$

as $N \to \infty$. If $\mathcal{O}$ consisted of all integer pairs
$(b, d)^t$, the above count would be replaced by $N^2$, ignoring
constants. So this is the reason we call $\mathcal{O}$ thin: it contains
many fewer points than the ambient set in which it naturally sits.

From (2.24) again, the set $\mathfrak{D}_\mathcal{A}$ is nothing more than
the set of bottom right entries of matrices in $\Gamma_\mathcal{A}$. This
can be isolated via:

$$\langle v_0, \mathcal{O} \rangle = \langle v_0, \Gamma \cdot v_0 \rangle = \mathfrak{D}_\mathcal{A}, \tag{2.28}$$

where the inner product is the standard one on $\mathbb{R}^2$. Thus $d$
is represented if and only if there is a $\gamma \in \Gamma$ so that

$$d = \langle v_0, \gamma \cdot v_0 \rangle, \tag{2.29}$$

with $v_0$ given in (2.27).
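
The equivalence between a continued fraction and a product of $2 \times 2$ matrices can be checked directly. The sketch below is an illustration in Python; the function name and the tuple layout are choices of this example. It multiplies out the generators for a given list of partial quotients and reads off the second column $(b, d)^t$.

```python
def matrix_product(quotients: list[int]) -> tuple[int, int, int, int]:
    """Entries (p, b, q, d) of the product of [[0, 1], [1, a]] over the quotients."""
    p, b, q, d = 1, 0, 0, 1          # identity matrix [[p, b], [q, d]]
    for a in quotients:
        # Right-multiply [[p, b], [q, d]] by [[0, 1], [1, a]].
        p, b = b, p + a * b
        q, d = d, q + a * d
    return p, b, q, d


p, b, q, d = matrix_product([1, 3, 2, 35, 1, 1, 1, 4])
print(b, d, p * d - b * q)  # 3535 4547 1
```

Each generator has determinant $-1$, so a product of $k$ of them has determinant $(-1)^k$; in particular $b$ and $d$ are automatically coprime.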

2.2. Local Obstructions. 

One can now easily understand Remark 2.21, and the source of any
potential local obstructions. The key observation, via (2.28), is that
to understand $\mathfrak{D}_\mathcal{A} \pmod q$, one needs only to
understand the reduction of $\Gamma \pmod q$. And the latter can be
analyzed by some algebra, namely the so-called strong approximation
property; see e.g. [Rap12] for a comprehensive survey. This is a property
which determines when the reduction mod $q$ map is onto. For general
algebraic groups this is a deep theory, the first proof [MVW84] using
the classification of finite simple groups. But for $\mathrm{SL}_2$, the
proofs are elementary, see e.g. [DSV03].

First observe that $\Gamma$ sits inside the integer points of the
algebraic group $\mathrm{GL}_2$, meaning that any solution in $\mathbb{Z}$
to the polynomial equation $(ad - bc)m = 1$ gives an element
$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}_2(\mathbb{Z})$,
and vice-versa. Actually $\mathrm{GL}_2$ does not have strong
approximation (e.g. the determinant in $\mathrm{GL}_2(\mathbb{Z})$ can
only be $\pm 1$, while in $\mathrm{GL}_2(\mathbb{Z}/5\mathbb{Z})$ it is 1,
2, 3 or 4, so the reduction map is not onto). So we first pass to
$\mathrm{SL}_2$, as follows. The generators in (2.25) all have
determinant $-1$, so the product of any two has determinant $+1$. We make
these products the generators for a subsemigroup $\tilde\Gamma$ of
$\Gamma$, that is, set $\tilde\Gamma := \Gamma \cap \mathrm{SL}_2$. We
recover the original $\Gamma$-orbit $\mathcal{O}$ in (2.26) by a finite
union of $\tilde\Gamma$-orbits. The limiting Cantor set and its Hausdorff
dimension are unaffected.

Then strong approximation says essentially that for $p$ a sufficiently
large prime, and $q = p^e$ any $p$ power, the reduction of $\tilde\Gamma$
mod $q$ is all of $\mathrm{SL}_2(\mathbb{Z}/q\mathbb{Z})$. (It does not
matter that $\tilde\Gamma$ is only a semigroup; upon reduction mod $q$, it
becomes a group.) Moreover for ramified primes $p$ (those for which the
reduction mod $p$ is not onto), the reduction mod powers of $p$
stabilizes after some finite power. This means that there is some
$e_0 = e_0(p, \tilde\Gamma)$ so that the following holds. For any
$e > e_0$, if $M \in \mathrm{SL}_2(\mathbb{Z}/p^e\mathbb{Z})$ is such that
its reduction is in $\tilde\Gamma \pmod{p^{e_0}}$, then $M$ is also in
$\tilde\Gamma \pmod{p^e}$. (These statements are best made in the
language of $p$-adic numbers, which we avoid here.) A key ingredient is
that, while $\tilde\Gamma$ is some strange subset of
$\mathrm{SL}_2(\mathbb{Z})$, it is nevertheless Zariski dense in
$\mathrm{SL}_2$. This means that if $P(a, b, c, d)$ is a polynomial which
vanishes for every
$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \tilde\Gamma$, then $P$
also vanishes on all matrices in $\mathrm{SL}_2$ with entries in
$\mathbb{C}$.

In the above, "sufficiently large," both for primes $p$ to be
unramified, and the stabilizing powers $e_0$ of ramified primes, can be
effectively computed in terms of the generators. Then for an arbitrary
modulus $q = p_1^{e_1} \cdots p_k^{e_k}$, the reduction mod $q$ can be
pieced together from those mod $p_j^{e_j}$ using a type of Chinese
Remainder Theorem for groups called Goursat's Lemma. This leaves some
finite group theory to determine completely the reduction of
$\tilde\Gamma$ mod any $q$, and hence explains all local obstructions via
(2.28).
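
For a single modulus the reduction can also be found by brute force, since the semigroup reduced mod $q$ is a finite set closed under multiplication by the generators. The sketch below is an illustration in Python with invented function names; it assumes $q \ge 2$, works with the full semigroup $\Gamma$ rather than its determinant-one part, and makes no use of strong approximation. It computes that closure and reads off the residues of the bottom right entries, as in (2.28).

```python
def semigroup_mod_q(alphabet, q):
    """All products of the generators [[0, 1], [1, a]], a in alphabet, reduced mod q."""
    # Each matrix [[p, b], [r, d]] is stored as the tuple (p, b, r, d).
    generators = [(0, 1, 1, a % q) for a in alphabet]
    seen, frontier = set(generators), list(generators)
    while frontier:
        p, b, r, d = frontier.pop()
        for _, _, _, a in generators:
            # Right-multiply by [[0, 1], [1, a]] and reduce mod q.
            product = (b, (p + a * b) % q, d, (r + a * d) % q)
            if product not in seen:
                seen.add(product)
                frontier.append(product)
    return seen


def denominators_mod_q(alphabet, q):
    """Residues mod q of the bottom right entries, i.e. of represented d."""
    return {d for _, _, _, d in semigroup_mod_q(alphabet, q)}


print(denominators_mod_q({2, 4, 6, 8, 10}, 4))   # {0, 1, 2}, as in Lemma 2.19
print(denominators_mod_q({1, 2}, 4))             # {0, 1, 2, 3}
```

This only tests one modulus at a time; ruling out obstructions for every $q$ at once is what the strong approximation argument provides.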

We now leave Zaremba's problem, and return to sketch a proof of
Theorem Z in §5.
Figure 6. Tangent circles: (a) three mutually tangent circles; (b) two
more tangent circles; (c) six more tangent circles.



3. Integral Apollonian Gaskets 

Apollonius of Perga (ca 262-190 BC) wrote a two-volume book on
Tangencies, solving in every conceivable configuration the following
general problem: Given three circles in the plane, any of which may
have radius zero (a point) or infinity (a line), construct a circle
tangent to the given ones. The volumes were lost but the statements
survived via a survey of the work by Pappus. In the special case when
the given three circles are themselves mutually tangent with disjoint
points of tangency (Figure 6a), Apollonius proved that

there are exactly two solutions (3.1)

to his problem (Figure 6b). Adding these new circles to the
configuration, one has many other triples of tangent circles, and
Apollonius's construction can be applied to them (Figure 6c). Iterating
in this way ad infinitum, as apparently was first done in Leibniz's
notebook, gives rise to a circle packing, the closure of which has
become known in the last century as an Apollonian gasket. We restrict
our discussion henceforth to bounded gaskets, such as that illustrated
in Figure 1; there the number shown inside a circle is its curvature,
that is, one over its radius. Such pictures have received considerable
attention recently, see e.g. [LMW02, GLM+03, GLM+05, GLM+06a, GLM+06b,
EL07, Sar07, Sar08, BGS10, KO11, Oh10, BF11, Sar11, Fuc11, FS11, OS12,
Vin12, LO12, BK12]. We will focus our discussion on the following two
problems:

(1) The Counting Problem: For a fixed gasket $\mathscr{G}$, how quickly
do the circles shrink, or alternatively, how many circles are there
in $\mathscr{G}$ with curvature bounded by a growing parameter $T$?
(2) The Local-Global Problem: Suppose $\mathscr{G}$ is furthermore
integral, meaning that the curvatures of all circles in it are
integers, such as the gasket in Figure 1. How many distinct integers
appear up to a growing parameter $N$? That is, count curvatures up to
$N$, but without multiplicity.

Problem (2) does not yet look like a local-global question, but will 
soon turn into one. We first address Problem (1) in more detail. 



3.1. The Counting Problem. 

3.1.1. Preliminaries. 

Some notation: for a typical circle $C$ in a fixed bounded gasket
$\mathscr{G}$, let $r(C)$ be its radius and

$$b(C) = 1/r(C)$$

its curvature (or "bend"). Let

$$\mathcal{N}_\mathscr{G}(T) := \#\{ C \in \mathscr{G} : b(C) < T \} \tag{3.2}$$

be the desired counting function. To study this quantity, one might
introduce an "L-function":

$$\mathcal{L}_\mathscr{G}(s) := \sum_{C \in \mathscr{G}} \frac{1}{b(C)^s} = \sum_{C \in \mathscr{G}} r(C)^s. \tag{3.3}$$

Since the sum of the areas of inside circles in $\mathscr{G}$ yields the
area of the bounding circle, the series $\mathcal{L}_\mathscr{G}$
converges for $\Re(s) \ge 2$. It has some abscissa of convergence
$\delta$, meaning $\mathcal{L}_\mathscr{G}$ converges for
$\Re(s) > \delta$ and diverges for $\Re(s) < \delta$. It follows from
(3.3) that $\delta$ is the Hausdorff dimension of the gasket
$\mathscr{G}$ [Boy73]. In fact, Apollonian gaskets are rigid, in the
sense that one can be mapped to any other by Möbius transformations. The
latter are conformal (angle preserving) motions of the complex plane,
sending $z \mapsto (az + b)/(cz + d)$, $ad - bc = 1$. Hence $\delta$ is a
universal constant; McMullen [McM98] estimates that

$$\delta = 1.30568\ldots \tag{3.4}$$

From such considerations, Boyd [Boy82] was able to conclude that

$$\frac{\log \mathcal{N}_\mathscr{G}(T)}{\log T} \to \delta,$$

as $T \to \infty$.

To refine this crude estimate to an asymptotic formula for
$\mathcal{N}_\mathscr{G}(T)$, the author and Oh [KO11] established a
"spectral interpretation" for $\mathcal{L}_\mathscr{G}$, proving:

$$\mathcal{N}_\mathscr{G}(T) \sim c \cdot T^\delta, \tag{3.5}$$

for some $c = c(\mathscr{G}) > 0$, as $T \to \infty$. (This asymptotic
was recently refined further in Vinogradov's thesis [Vin12] and
independently by Lee-Oh [LO12], giving lower order error terms.) The
remainder of this subsection is devoted to explaining this spectral
interpretation and highlighting some of the ideas going into the proof
of (3.5).

Figure 7. Generation from a root quadruple

3.1.2. Root quadruples and generation by reflection. 

It is easy to see [GLM+03, p. 14] that each such gasket $\mathscr{G}$ contains a 
root configuration $\mathcal{C} = \mathcal{C}(\mathscr{G}) := (C_1, C_2, C_3, C_4)$ of four largest mutually 
tangent circles in $\mathscr{G}$. Let 

$$\mathbf{v}_0 = \mathbf{v}_0(\mathscr{G}) = (b_1, b_2, b_3, b_4)^t \tag{3.6}$$

with $b_j = b(C_j)$ be the root quadruple of corresponding curvatures. 
The bounding circle, being internally tangent to the others, is given 
opposite orientation to make all interiors disjoint; this is accounted 
for by giving it negative curvature. For example in Figure 1, the root 
quadruple is 

$$\mathbf{v}_0 = (-10, 18, 23, 27)^t, \tag{3.7}$$

where the bounding circle has radius 1/10. 

Three tangent circles, say $C_1, C_2, C_3$, have three points of tangency, 
and determine a dual circle $\tilde{C}_4$ passing through these points, see 
Figure 7. Thus the root configuration $\mathcal{C}$ determines a dual configuration 
$\tilde{\mathcal{C}} = (\tilde{C}_1, \tilde{C}_2, \tilde{C}_3, \tilde{C}_4)$ of four mutually tangent circles, orthogonal to 
those in $\mathcal{C}$, see Figure 8. Reflection through $\tilde{C}_4$ fixes $C_1$, $C_2$, and $C_3$, 
and sends $C_4$ to $C_4'$, the other solution to Apollonius's problem (3.1), 
see Figure 7. Starting with the root configuration, repeated reflections 
through the dual circles give the whole circle packing. 

Figure 8. Root and dual configurations 

3.1.3. Hyperbolic space and the group $\mathcal{A}$. 

Following Poincaré, we extend these circle reflections to the 
hyperbolic upper half space, 

$$\mathbb{H}^3 := \{(x_1, x_2, y) : x_1, x_2 \in \mathbb{R},\ y > 0\}, \tag{3.8}$$

replacing the action of the dual circle $\tilde{C}_j$ by a reflection through a 
(hemi)sphere $\mathfrak{s}_j$ whose equator is $\tilde{C}_j$ (with $j = 1, \ldots, 4$). We abuse 
notation, writing $\mathfrak{s}_j$ for both the hemisphere and the conformal map 
reflecting through $\mathfrak{s}_j$. The group 

$$\mathcal{A} := \langle \mathfrak{s}_1, \mathfrak{s}_2, \mathfrak{s}_3, \mathfrak{s}_4 \rangle < \mathrm{Isom}(\mathbb{H}^3), \tag{3.9}$$

generated by these reflections acts discretely on $\mathbb{H}^3$; it is a so-called 
Schottky group, in that the four generating spheres have disjoint 
interiors. 

The $\mathcal{A}$-orbit of any fixed base point $p_0 \in \mathbb{H}^3$ has a limit set in the 
boundary $\partial\mathbb{H}^3$, which is easily seen to be the original gasket, see Figure 
9. A fundamental domain for an action is a region 

$$\Omega \subset \mathbb{H}^3 \tag{3.10}$$

so that any point in $\mathbb{H}^3$ can be sent to $\Omega$ in an essentially unique way; 
for the action of $\mathcal{A}$, one can take $\Omega$ to be the exterior of the four 
hemispheres. To see this, observe that if a point $p = (x_1, x_2, y) \in \mathbb{H}^3$ 
is inside one of the spheres $\mathfrak{s}_j$, then its reflection $\mathfrak{s}_j(p)$ is outside of $\mathfrak{s}_j$ 
and has a strictly larger $y$ value. This does not guarantee that $\mathfrak{s}_j(p)$ is 
outside all of the other spheres, but if it is inside some $\mathfrak{s}_k$, then reflection 
through $\mathfrak{s}_k$ will again have even higher $y$ value. This procedure must 
halt after finitely many iterations, since the only limit points of $\mathcal{A}$ are 
in the boundary $\partial\mathbb{H}^3$ where $y = 0$. And it halts only when the image is 
outside of the four geodesic hemispheres. Uniqueness follows since any 
reflection $\mathfrak{s}_j$ takes a point in $\Omega$ to a point inside $\mathfrak{s}_j$, that is, not in $\Omega$. 

Figure 9. Poincaré extension: an $\mathcal{A}$-orbit in $\mathbb{H}^3$ 

Two facts are evident from the above: first of all, $\mathcal{A}$ is 
geometrically finite, meaning it has a fundamental domain bounded by a finite 
number (here it is four) of geodesic^ hemispheres; on the other hand, 
$\mathcal{A}$ has infinite co-volume, that is, any fundamental domain has infinite 
volume with respect to the hyperbolic measure 

$$y^{-3}\, dx_1\, dx_2\, dy$$

in the coordinates (3.8). Note moreover that $\mathcal{A}$ has the structure of a 
Coxeter group, being free save the relations $\mathfrak{s}_j^2 = I$ for the generators. 
It is also the symmetry group of all Möbius transformations fixing $\mathscr{G}$. 

3.1.4. Descartes' Circle Theorem and integral gaskets. 

Next we need an observation due to Descartes in the year 1643 
[Des01, pp. 37-50] (though his proof had a gap [Cox68]), that a 
quadruple $\mathbf{v} = (b_1, b_2, b_3, b_4)^t$ of signed curvatures of four mutually tangent 
circles lies on the cone 

$$Q(\mathbf{v}) = 0, \tag{3.11}$$

where $Q$ is the so-called "Descartes quadratic form" 

$$Q(\mathbf{v}) := 2\left(b_1^2 + b_2^2 + b_3^2 + b_4^2\right) - (b_1 + b_2 + b_3 + b_4)^2. \tag{3.12}$$

^A geodesic in hyperbolic space is a straight vertical line or a semicircle 
orthogonal to the boundary $\partial\mathbb{H}^3$. 

By a real linear change of variables, $Q$ can be diagonalized to the form 

$$x^2 + y^2 + z^2 - w^2,$$

that is, it has signature (3,1). Arguably the most beautiful 
formulation of Descartes' Theorem (rediscovered on many separate occasions) 
is the following excerpt from Soddy's 1936 Nature poem [Sod36]: 

Four circles to the kissing come. / The smaller are the bender. / 

The bend is just the inverse of / The distance from the center. / 

Though their intrigue left Euclid dumb / There's now no need for rule of thumb. / 

Since zero bend's a dead straight line / And concave bends have minus sign, / 

The sum of the squares of all four bends / Is half the square of their sum. 

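If $b_1$, $b_2$ and $b_3$ are given, then (3.11) is a quadratic equation in $b_4$ with 
two solutions, $b_4$ and $b_4'$, say; this is an algebraic proof of Apollonius's 
theorem (3.1). It is then an elementary exercise to see that 

$$b_4 + b_4' = 2(b_1 + b_2 + b_3).$$

In other words, if the quadruple $(b_1, b_2, b_3, b_4)^t$ is given, then one obtains 
the quadruple with $b_4$ replaced by $b_4'$ via a linear action: 

$$\begin{pmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ 2 & 2 & 2 & -1 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4' \end{pmatrix}.$$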

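Hence we have given an algebraic realization to the geometric action 
of $\tilde{C}_4$ (or $\mathfrak{s}_4$) on the root quadruple, see again Figure 7. Call the above 
4x4 matrix $S_4$. Of course one could also send other $b_j$ to $b_j'$ keeping 
the three complementary curvatures fixed, via the matrices 

$$S_1 = \begin{pmatrix} -1 & 2 & 2 & 2 \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix}, \quad S_2 = \begin{pmatrix} 1 & & & \\ 2 & -1 & 2 & 2 \\ & & 1 & \\ & & & 1 \end{pmatrix}, \quad S_3 = \begin{pmatrix} 1 & & & \\ & 1 & & \\ 2 & 2 & -1 & 2 \\ & & & 1 \end{pmatrix}. \tag{3.13}$$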

Moreover one can iterate these actions, so we introduce the so-called 
Apollonian group $\Gamma$, isomorphic to $\mathcal{A}$, generated by the $S_j$: 

$$\Gamma := \langle S_1, S_2, S_3, S_4 \rangle. \tag{3.14}$$

Then the orbit 

$$\mathcal{O} := \Gamma \cdot \mathbf{v}_0 \tag{3.15}$$

of the root quadruple $\mathbf{v}_0$ under the Apollonian group $\Gamma$ consists of all 
quadruples corresponding to curvatures of four mutually tangent circles 
in the gasket $\mathscr{G}$. We can now explain the integrality of all curvatures in 
Figure 1: the group $\Gamma$ has only integer matrices, so if the root quadruple 
$\mathbf{v}_0$ (or for that matter any curvatures of four mutually tangent circles 
in $\mathscr{G}$) is integral, then all curvatures in $\mathscr{G}$ are integers! This fact seems 
to have been first observed by Soddy [Sod37]. 
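The integrality argument above can be checked numerically. The following is a minimal sketch, not taken from the original article: the helper names (`generator`, `act`, `descartes_form`, `orbit_up_to_depth`) are hypothetical, and only the matrices $S_1, \ldots, S_4$, the Descartes form (3.12), and the root quadruple (3.7) come from the text. It applies all words of bounded length in the generators to the root quadruple and confirms that every resulting quadruple is integral and lies on the cone $Q = 0$.

```python
# Hypothetical helper names; only the matrices S_1..S_4, the form Q, and the
# root quadruple (-10, 18, 23, 27) come from the text.

def generator(j):
    """4x4 integer matrix S_{j+1}: the identity, except that row j is
    (2, 2, 2, 2) with -1 in position j."""
    rows = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
    rows[j] = [-1 if c == j else 2 for c in range(4)]
    return rows

GENERATORS = [generator(j) for j in range(4)]

def act(matrix, v):
    """Matrix-vector product over the integers."""
    return tuple(sum(matrix[r][c] * v[c] for c in range(4)) for r in range(4))

def descartes_form(v):
    """Q(v) = 2 * (sum of squares) - (sum)^2, as in (3.12)."""
    return 2 * sum(b * b for b in v) - sum(v) ** 2

def orbit_up_to_depth(v0, depth):
    """Quadruples gamma * v0 for words gamma of length <= depth."""
    seen = {tuple(v0)}
    frontier = [tuple(v0)]
    for _ in range(depth):
        next_frontier = []
        for v in frontier:
            for S in GENERATORS:
                w = act(S, v)
                if w not in seen:
                    seen.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return seen

ROOT = (-10, 18, 23, 27)
quadruples = orbit_up_to_depth(ROOT, depth=5)
print(len(quadruples), "quadruples; all on the cone:",
      all(descartes_form(v) == 0 for v in quadruples))
```

Working with exact Python integers, rather than floating point, is what makes the integrality check meaningful here.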

3.1.5. Reformulating the counting statement, and the thin orbit. 

Moreover, note that starting with $\mathbf{v}_0$, any new circle generated by 
a reflection is the smallest in its configuration, and hence has largest 
curvature. That is, for $\mathbf{v} = \gamma \cdot \mathbf{v}_0 \in \mathcal{O}$, writing $\gamma \in \Gamma$ as a reduced word 
in the generators $\gamma = S_{i_k} \cdots S_{i_1}$, the last multiplication by $S_{i_k}$ changes 
one entry, which is the largest entry in $\mathbf{v}$. Hence, setting $\|\mathbf{v}\|_\infty$ to be 
the max-norm, and for $T$ large, we can rewrite $\mathcal{N}_{\mathscr{G}}(T)$ in (3.2) as 

$$\mathcal{N}_{\mathscr{G}}(T) = 4 + \#\{\mathbf{v} \in \mathcal{O} : \mathbf{v} \neq \mathbf{v}_0,\ \|\mathbf{v}\|_\infty < T\}. \tag{3.16}$$

Here the first "4" accounts for the root quadruple $\mathbf{v}_0$. 
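The reformulation (3.16) suggests a direct way to count circles: walk over reduced words, and stop descending as soon as the newly created curvature reaches $T$, since later reflections only produce larger curvatures. The sketch below is illustrative only; the function name `count_circles` is hypothetical, and the pruning relies on the monotonicity property just described (so it assumes the starting quadruple is a root quadruple). The value of $\delta$ is the estimate (3.4).

```python
def count_circles(root, T):
    """Count circles with curvature < T, in the spirit of (3.16): the root
    circles below T, plus one new circle per non-trivial reduced word whose
    newest (largest) entry is below T. Assumes `root` is a root quadruple."""
    count = sum(1 for b in root if b < T)
    # Each stack entry: (quadruple, index of the reflection applied last).
    stack = [(tuple(root), None)]
    while stack:
        v, last = stack.pop()
        for j in range(4):
            if j == last:
                continue  # S_j * S_j = I, so this word would not be reduced
            new_b = 2 * (sum(v) - v[j]) - v[j]  # action of S_j on entry j
            if new_b < T:
                count += 1
                stack.append((v[:j] + (new_b,) + v[j + 1:], j))
    return count

ROOT = (-10, 18, 23, 27)
DELTA = 1.30568  # McMullen's estimate (3.4)
for T in (100, 1000, 10000):
    n = count_circles(ROOT, T)
    print(T, n, n / T ** DELTA)
```

If the asymptotic (3.5) holds, the last printed column should settle down to a constant as $T$ grows; the small values of $T$ used here only hint at that.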

We have thus converted the circle counting problem into something 
seemingly more tractable: the counting problem for a $\Gamma$-orbit. That 
said, we clearly need a better understanding of the group $\Gamma$. Returning 
to the Descartes form $Q$ in (3.12), we have by construction (and one 
can check directly) that for each $j = 1, \ldots, 4$, 

$$Q(S_j \cdot \mathbf{v}) = Q(\mathbf{v}),$$

for any $\mathbf{v} \in \mathbb{R}^4$. That is, each generator $S_j$ lies in the so-called 
orthogonal group preserving the quadratic form $Q$, 

$$O_Q := \{g \in \mathrm{GL}_4 : Q(g \cdot \mathbf{v}) = Q(\mathbf{v}),\ \forall \mathbf{v}\}.$$

Hence $\Gamma$ also sits inside $O_Q$, and moreover inside $O_Q(\mathbb{Z})$, the group of 
matrices in $O_Q$ with integer entries. The latter is a well understood 
algebraic group, again meaning that any solution to a certain set of 
polynomial equations gives an element in $O_Q$, and vice versa. But 
$\Gamma$ is quite a mysterious group, in particular having infinite index in 
$O_Q(\mathbb{Z})$ (this fact is equivalent to $\mathcal{A}$ having infinite co-volume). It is also 
worth noting here that the general membership problem in a group is 
known to be undecidable [Nov55], so presenting a matrix group via its 
generators leaves much to be desired.^ 

Just as in Zaremba's problem, we can now again call this orbit $\mathcal{O}$ 
thin; indeed, for the counting problem with $\Gamma$ replaced by the full 
$O_Q(\mathbb{Z})$, standard arguments in automorphic forms or ergodic theory 
[DRS93, EM93] show that 

$$\#\{\mathbf{v} \in O_Q(\mathbb{Z}) \cdot \mathbf{v}_0 : \|\mathbf{v}\|_\infty < T\} \sim c\, T^2, \quad \text{as } T \to \infty, \tag{3.17}$$

for some $c > 0$. So comparing (3.17) to (3.16), (3.5) and (3.4), where 
the power drops from $T^2$ to $T^\delta$ with $\delta < 2$, we see that the $\Gamma$ orbit is 
quite degenerate, having many fewer points. 

^That said, for our particular group $\Gamma$, one can use a reduction algorithm to root 
quadruples to determine membership. 



3.1.6. Sketch of the counting statement. 

Finally, we explain the aforementioned spectral interpretation, by 
first giving an analogous elementary example of a counting statement 
in another discrete group: the integers. Let us spectrally count the 
number of integers of size at most $T$: 

$$\mathcal{N}_{\mathbb{Z}}(T) := \#\{n \in \mathbb{Z} : |n| < T\}.$$

Of course this is a trivial problem, 

$$\mathcal{N}_{\mathbb{Z}}(T) = 2T + O(1), \tag{3.18}$$

but it will be instructive to analyze it by harmonic analysis. To this 
end, let 

$$f(x) := \mathbf{1}_{\{|x| < 1\}},$$

where $\mathbf{1}_{\{\cdot\}}$ is the indicator function. Scale $f$ to 

$$f_T(x) := f(x/T) = \mathbf{1}_{\{|x| < T\}},$$

and periodize it with respect to the discrete group $\mathbb{Z}$: 

$$F_T(x) := \sum_{n \in \mathbb{Z}} f_T(n + x). \tag{3.19}$$

Then we have 

$$F_T(0) = \sum_{n \in \mathbb{Z}} \mathbf{1}_{\{|n| < T\}} = \mathcal{N}_{\mathbb{Z}}(T). \tag{3.20}$$

By construction, $F_T(x) = F_T(x + 1)$, that is, it takes values on the 
circle $X := \mathbb{Z} \backslash \mathbb{R}$, and is square-integrable, $F_T \in L^2(X)$. The Laplace 
operator 

$$\Delta := -\operatorname{div} \circ \operatorname{grad} = -\frac{\partial^2}{\partial x^2}$$

on smooth functions can be extended to act on the whole Hilbert space 
$L^2(X)$ and is self-adjoint and positive definite (by our choice of sign) 
with respect to the standard inner product 

$$\langle F, G \rangle = \int_X F(x)\, \overline{G(x)}\, dx.$$

(Proof: partial integration.) Its spectrum $\mathrm{Spec}(\Delta)$ is just the set of its 
eigenvalues, with multiplicity. Elementary Fourier analysis shows that 
eigenfunctions of $\Delta$ invariant under $\mathbb{Z}$-translations are scalar multiples 
of 

$$\varphi_m(x) = e^{2\pi i m x},$$

for $m \in \mathbb{Z}$. This function has Laplace eigenvalue 

$$\lambda_m = 4\pi^2 m^2,$$

and these fully span the spectrum (they have multiplicity two, except 
when $m = 0$). Expanding spectrally gives 

$$F_T(x) = \sum_{\lambda_m \in \mathrm{Spec}(\Delta)} \langle F_T, \varphi_m \rangle\, \varphi_m(x), \tag{3.21}$$

where equality is in the $L^2$-sense. (Note that the $\varphi_m$ are already scaled 
to have unit $L^2$-norm.) The bottom of the spectrum $\lambda_0 = 0$ corresponds 
to the constant function $\varphi_0(x) = 1$, and contributes the entire "main 
term" in (3.18) to (3.21): 

$$\langle F_T, \varphi_0 \rangle \cdot \varphi_0 = \left( \int_{\mathbb{Z} \backslash \mathbb{R}} \sum_{n \in \mathbb{Z}} f_T(n + x) \cdot 1\, dx \right) \cdot 1 = T \int_{\mathbb{R}} f(x)\, dx = 2T,$$

after inserting (3.19), a change of variables, and "unfolding" $\int_{\mathbb{Z} \backslash \mathbb{R}} \sum_{\mathbb{Z}}$ to 
just $\int_{\mathbb{R}}$. That said, the equality (3.21) is in the $L^2$ sense, not pointwise 
(we cannot evaluate (3.21) at the point $x = 0$, as needed in (3.20)), 
and moreover the rest of the spectrum in (3.21), if bounded in absolute 
value, does not converge. But there are standard methods (smoothing 
and later unsmoothing) which overcome these technical irritants. 
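As a numerical sanity check of this toy example (and not of the smoothing argument itself), one can sum the spectral expansion at $x = 0$ directly. The sketch below rests on two facts that are assumptions relative to the text: the coefficient $\langle F_T, \varphi_m \rangle = \sin(2\pi m T)/(\pi m)$ for $m \neq 0$, obtained by unfolding exactly as for $m = 0$, and the classical fact that for non-integer $T$ the Fourier series of this particular step function does converge at $x = 0$, though only conditionally and slowly. The function names are hypothetical.

```python
import math

def direct_count(T):
    """#{n in Z : |n| < T}, by brute force."""
    bound = math.ceil(T)
    return sum(1 for n in range(-bound, bound + 1) if abs(n) < T)

def spectral_count(T, M):
    """Partial spectral sum for F_T(0): the main term 2T from the bottom of
    the spectrum, plus the frequencies 1 <= |m| <= M paired as m and -m."""
    main_term = 2 * T
    oscillating = sum(
        2 * math.sin(2 * math.pi * m * T) / (math.pi * m)
        for m in range(1, M + 1)
    )
    return main_term + oscillating

for T in (2.25, 3.5, 10.3):
    print(T, direct_count(T), spectral_count(T, 5000))
```

The partial sums approach the true count only at rate roughly $1/M$ and are not absolutely convergent, which is one way to see why smoothing is the standard remedy.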

A version of the above works with the Apollonian group $\Gamma$ in place 
of $\mathbb{Z}$, once one overcomes a number of further technical obstructions. 
The reader may wish to omit the following paragraph on the first pass; 
it is not essential to the sequel. 

We now need non-abelian harmonic analysis on the space $L^2(X)$ with 

$$X := \mathcal{A} \backslash \mathbb{H}^3,$$

the hyperbolic 3-fold in Figure 9. The (positive definite) hyperbolic 
Laplacian is 

$$\Delta := -y^2 \left( \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial y^2} \right) + y \frac{\partial}{\partial y}$$

in the coordinates (3.8). The spectrum in this setting, as studied 
by Lax-Phillips [LP82], has both continuous and discrete components 
(though only a finite number of the latter). As $X$ has infinite 
volume, the constant function is no longer square-integrable, and the 
bottom eigenvalue $\lambda_0$ is strictly positive. A beautiful result in 
Patterson-Sullivan theory [Pat76, Sul84] relates this eigenvalue to the Hausdorff 
dimension $\delta$ of the limiting gasket $\mathscr{G}$, namely 

$$\lambda_0 = \delta(2 - \delta).$$

The corresponding base eigenfunction $\varphi_0$ replaces the role of the 
constant function. Here we have used crucially that $\mathcal{A}$ is geometrically 
finite, and that $\delta > 1$, see (3.4). Even this is insufficient: because of 
the non-Euclidean norm $\|\cdot\|_\infty$ in (3.16), one must work not on $X$ but 
its unit tangent bundle $Y := T^1(X)$. And moreover we do not know 
how to handle the continuous spectrum directly, applying instead 
general results in the representation theory of semisimple groups about 
ergodic properties of flows on $Y$. At this point, we will not say more 
about the proof, inviting the interested reader to consult the original 
references [KO11, Vin12, LO12]. 

3.2. The Local-Global Problem. 

Assume now that $\mathscr{G}$ is not only bounded but also integral (recall that 
this means it has only integers for curvatures). Shrinking the gasket by 
a factor of two will double all of its curvatures, making them all even. 
So we should rescale an integral gasket to make it primitive, meaning 
there is no number other than ±1 dividing all of the curvatures. In 
fact, all of the salient features of the problem persist if we fix $\mathscr{G}$ to 
be the packing shown in Figure 1, and we do so henceforth. Recall 
that the problem we wish to now address is: How many curvatures are 
there up to some parameter $N$, counting without multiplicity, that is, 
counting only distinct curvatures? 

First some more notation: let $\mathscr{B} = \mathscr{B}(\mathscr{G})$ be the set of all curvatures 
of circles $C$ in the gasket $\mathscr{G}$, 

$$\mathscr{B} := \{n \in \mathbb{Z} : \exists\, C \in \mathscr{G} \text{ with } b(C) = n\},$$

and call $n$ represented if $n \in \mathscr{B}$. Staring at Figure 1 for a moment or 
two, one might observe that every curvature in our $\mathscr{G}$ is 

$$\equiv 2, 3, 6, 11, 14, 15, 18, \text{ or } 23 \pmod{24}. \tag{3.22}$$

These are the local obstructions for $\mathscr{G}$, and we call $\mathscr{A} = \mathscr{A}(\mathscr{G})$ the set 
of all admissible numbers $n$ satisfying (3.22). In general, one calls $n$ 
admissible if, as before, it is everywhere locally represented, 

$$n \in \mathscr{B} \pmod q, \quad \forall q \ge 1. \tag{3.23}$$

It cannot be the case that $\mathscr{A} = \mathscr{B}$ since, for example, $n = 15$ is 
admissible, but a circle of radius 1/15 does not appear in our gasket. 
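The congruence observation (3.22) and the failure of $\mathscr{A} = \mathscr{B}$ for small $n$ can both be seen by brute force. The sketch below is illustrative: the function name `curvatures_below` and the cutoff 2000 are arbitrary choices, not from the text, and the enumeration assumes (as described in §3.1.5) that each reflection along a reduced word creates a larger curvature, so that branches can be pruned at the cutoff.

```python
def curvatures_below(root, T):
    """Set of distinct curvatures < T in the gasket with root quadruple `root`."""
    found = {b for b in root if b < T}
    stack = [(tuple(root), None)]
    while stack:
        v, last = stack.pop()
        for j in range(4):
            if j == last:
                continue  # skip non-reduced words
            new_b = 2 * (sum(v) - v[j]) - v[j]
            if new_b < T:
                found.add(new_b)
                stack.append((v[:j] + (new_b,) + v[j + 1:], j))
    return found

ADMISSIBLE_MOD_24 = {2, 3, 6, 11, 14, 15, 18, 23}  # the classes in (3.22)
ROOT = (-10, 18, 23, 27)
CUTOFF = 2000

represented = curvatures_below(ROOT, CUTOFF)
residues = {b % 24 for b in represented}
missing = sorted(
    n for n in range(1, CUTOFF)
    if n % 24 in ADMISSIBLE_MOD_24 and n not in represented
)
print("residues mod 24:", sorted(residues))
print("admissible but not represented below", CUTOFF, ":", len(missing))
print("first few:", missing[:10])
```

The list `missing` contains the admissible integers in the range that are not curvatures, such as 15.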
Nevertheless, as in Zaremba's problem, we have the following 

Conjecture A. Every sufficiently large admissible number is the 
curvature of some circle in $\mathscr{G}$. 

This conjecture is stated by Graham-Lagarias-Mallows-Wilks-Yan 
[GLM+03, p. 37], in the first of a lovely series of papers on Apollonian 
gaskets and generalizations. They observe empirically that congruence 
obstructions for any integral gasket seem to be to the modulus 24, and 
this is completely clarified (as we explain below) by Fuchs [Fuc11] in her 
thesis. Further convincing numerical evidence towards the conjecture 
is given in Fuchs-Sanden [FS11]. Here is some recent progress. 

Theorem A (Bourgain-K. 2012 [BK12]). Almost every admissible 
number is the curvature of some circle in $\mathscr{G}$. 

Again, "almost every" is in the sense of density, that 

$$\frac{\#(\mathscr{B} \cap [1, N])}{\#(\mathscr{A} \cap [1, N])} \to 1, \tag{3.24}$$

as $N \to \infty$. It follows from (3.22) that for $N$ large, $\#(\mathscr{A} \cap [1, N])$ is 
about $N/3$ (there are 8 admissible residue classes mod 24), so (3.24) is 
equivalent to 

$$\#(\mathscr{B} \cap [1, N]) \sim \frac{N}{3}.$$

Some history on this problem: Graham et al. [GLM+03] already made 
the first progress, proving that 

$$\#(\mathscr{B} \cap [1, N]) \gg N^{1/2}. \tag{3.25}$$

Then Sarnak [Sar07] showed 

$$\#(\mathscr{B} \cap [1, N]) \gg \frac{N}{\sqrt{\log N}}, \tag{3.26}$$

before Bourgain-Fuchs [BF11] settled the so-called Positive Density 
Conjecture, that 

$$\#(\mathscr{B} \cap [1, N]) \gg N. \tag{3.27}$$
A key observation in the proof of Theorem A is that the problem is 
nearly identical to Zaremba's, in the following sense. Recall from (3.15) 
that the orbit $\mathcal{O} = \Gamma \cdot \mathbf{v}_0$ of the root quadruple $\mathbf{v}_0$ under the Apollonian 
group $\Gamma$ contains all quadruples of curvatures, and in particular its 
entries consist of all curvatures in $\mathscr{G}$. Hence the set $\mathscr{B}$ of all curvatures 
is simply the finite union of sets of the form 

$$\langle \mathbf{w}_0, \mathcal{O} \rangle = \langle \mathbf{w}_0, \Gamma \cdot \mathbf{v}_0 \rangle, \tag{3.28}$$

as $\mathbf{w}_0$ ranges through the standard basis vectors $\mathbf{e}_1 = (1, 0, 0, 0)^t, \ldots,$ 
$\mathbf{e}_4 = (0, 0, 0, 1)^t$, each picking off one entry of $\mathcal{O}$. A heuristic analogy 
between Zaremba and the Apollonian problem is actually already given 
in [GLM+03, p. 37], but it is crucial for us that both problems are 
exactly of the form (3.28); compare to (2.28). That is, $n$ is represented 
if and only if there is a $\gamma$ in the Apollonian group $\Gamma$ and some $\mathbf{w}_0 \in$ 
$\{\mathbf{e}_1, \ldots, \mathbf{e}_4\}$ so that 

$$n = \langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle. \tag{3.29}$$

Before saying more about the proof of Theorem A, we first discuss 
admissibility in greater detail. 

3.2.1. Local obstructions. 

Through (3.28), the admissibility condition (3.23) is again reduced 
to the study of the reduction of $\Gamma$ modulo $q$. An important feature 
here is that, like in the Zaremba case, the group $\Gamma$ is Zariski dense in 
$O_Q$. Recall that this means: if $P(\gamma)$ is a polynomial in the entries of a 
4x4 matrix $\gamma$ which vanishes for every $\gamma \in \Gamma$, then $P$ also vanishes on 
all complex matrices in $O_Q$. 

We would like again to exploit strong approximation, but neither $O_Q$ 
nor its orientation preserving subgroup $SO_Q := O_Q \cap SL_4$ has this 
property (being not even connected). But there is a standard method 
of applying strong approximation anyway, by first passing to a certain 
cover, as we now describe. 

From the theory of rational quadratic forms [Cas78], special 
orthogonal groups are covered by so-called spin groups, and it is a pleasant 
accident that, since $Q$ has signature (3, 1), the spin group of $SO_Q(\mathbb{R})$ is 
isomorphic to $SL_2(\mathbb{C})$; let us explain this covering map. The formulae 
are nicer if we first change variables (over $\mathbb{Q}$) from the quadratic form 
$Q$ to the equivalent form 

$$\tilde{Q}(x, y, z, w) := xw + y^2 + z^2.$$

Observe that the matrix 

$$M := \begin{pmatrix} -x & y + iz \\ y - iz & w \end{pmatrix}$$

has determinant equal to $-\tilde{Q}$ and is Hermitian, that is, fixed under 
transpose-conjugation. The group $SL_2(\mathbb{C})$, consisting of 2 x 2 complex 
matrices of determinant one, acts on $M$ by 

$$SL_2(\mathbb{C}) \ni g : M \mapsto g \cdot M \cdot \bar{g}^{\,t} =: M' = \begin{pmatrix} -x' & y' + iz' \\ y' - iz' & w' \end{pmatrix},$$

with $M'$ also Hermitian and of determinant $-\tilde{Q}$. Then it is easy to 
see that $(x', y', z', w')^t$ is a linear change of variables from $(x, y, z, w)^t$, 
via left multiplication by a matrix whose entries are quadratic in the 
entries of $g$. Explicitly, if 

$$g = \begin{pmatrix} a + \alpha i & b + \beta i \\ c + \gamma i & d + \delta i \end{pmatrix}, \tag{3.30}$$

then the change of variables matrix is 

$$\frac{1}{|\det(g)|^2} \begin{pmatrix} a^2 + \alpha^2 & 2(ac + \alpha\gamma) & 2(c\alpha - a\gamma) & -c^2 - \gamma^2 \\ ab + \alpha\beta & bc + ad + \beta\gamma + \alpha\delta & d\alpha + c\beta - b\gamma - a\delta & -cd - \gamma\delta \\ a\beta - b\alpha & -d\alpha + c\beta - b\gamma + a\delta & -bc + ad - \beta\gamma + \alpha\delta & d\gamma - c\delta \\ -b^2 - \beta^2 & -2(bd + \beta\delta) & 2(b\delta - d\beta) & d^2 + \delta^2 \end{pmatrix}. \tag{3.31}$$

Let $\tilde{\rho}$ be the (rational) map from $SL_2(\mathbb{C})$ to $GL_4(\mathbb{R})$, sending (3.30) to 
(3.31); then by construction (again one can verify directly) the image 
is in $SO_{\tilde{Q}}(\mathbb{R})$. (Some minor technical points: Being quadratic in the 
entries, $\tilde{\rho}$ is a double cover, with $\pm I$ having the same image. Moreover, 
$SL_2(\mathbb{C})$ is connected while $SO_{\tilde{Q}}(\mathbb{R})$ has two connected components, so 
$\tilde{\rho}$ only maps onto the identity component.) Then changing variables 
from $\tilde{Q}$ back to the Descartes form $Q$ by a conjugation, one gets the 
desired map 

$$\rho : SL_2(\mathbb{C}) \to SO_Q(\mathbb{R}).$$

It is straightforward then to compute the pullback of $\Gamma \cap SO_Q$ under 
$\rho$ (see [GLM+05, Fuc11]), the answer being the following 

Lemma 3.32. There is^ a homomorphism $\rho : SL_2(\mathbb{C}) \to SO_Q(\mathbb{R})$ so 
that the group $\tilde{\Gamma} := \rho^{-1}(\Gamma \cap SO_Q)$ sits in $SL_2(\mathbb{Z}[i])$ and is generated by 

Moreover, recalling the generators $S_j$ for $\Gamma$ in (3.13), one can arrange 
$\rho$ so that $\rho : \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \mapsto S_2 S_3$, and $\rho : \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \mapsto S_4 S_3$. 

In fact, we have just realized a conjugate of the group $\mathcal{A}$ (or rather 
its index-two orientation preserving subgroup) explicitly in terms of 
matrices in $PSL(2, \mathbb{C}) \cong \mathrm{Isom}^+(\mathbb{H}^3)$. 

From here, one follows the strategy outlined in §2.2. Using strong 
approximation for $SL_2(\mathbb{Z}[i])$ (one considers reduction mod principal ideals 
$(q)$), Goursat's Lemma, some finite group theory, and other 
ingredients, Fuchs [Fuc11] was able to determine completely the reduction of 
$\Gamma$ modulo any $q$, and hence explain all local obstructions. The answer 
is that all primes other than 2 and 3 are unramified, meaning, as in 
§2.2, that for $(q, 6) = 1$, 

$$\Gamma \cap SO_Q \pmod q = SO_Q(\mathbb{Z}/q\mathbb{Z}).$$

Recall again that the right hand side above is a well-understood 
algebraic group. And moreover, the prime 2 stabilizes (with the same 
meaning as §2.2) at the power $e_0(2) = 3$, that is at 8, and the prime 3 
stabilizes immediately at $e_0(3) = 1$. Then $\Gamma \pmod{24}$ is some explicit 
finite group, and looking at all the values of (3.28) for the given root 
quadruple $\mathbf{v}_0(\mathscr{G})$, one immediately sees all admissible residue classes. 

^And one can easily write it down explicitly: it is a conjugate of (3.31), but 
much messier and not particularly enlightening. We spare the reader. 



3.2.2. Partial Progress. 

Lemma 3.32 can already be quite useful; in particular, it easily 
implies (3.25) and (3.26), as follows. 

The Apollonian group $\Gamma$ contains the matrix $S_4 S_3$, which by Lemma 
3.32 is the image under $\rho$ of $\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$. The latter (and hence the former) 
is a unipotent matrix, meaning that all its eigenvalues are equal to 1. 
These have the important property that they grow only polynomially 
under exponentiation; in particular, $\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}^k = \begin{pmatrix} 1 & 0 \\ 2k & 1 \end{pmatrix}$, and one can check 
directly from (3.13) that 

$$(S_4 S_3)^k = \begin{pmatrix} 1 & & & \\ & 1 & & \\ 4k^2 - 2k & 4k^2 - 2k & 1 - 2k & 2k \\ 4k^2 + 2k & 4k^2 + 2k & -2k & 2k + 1 \end{pmatrix}. \tag{3.33}$$

Put the above matrix into (3.29) with the root quadruple $\mathbf{v}_0$ for our 
fixed gasket from (3.7), and take $\mathbf{w}_0 = \mathbf{e}_4$, say. Then for any $k \in \mathbb{Z}$, 
the number 

$$\langle \mathbf{e}_4, (S_4 S_3)^k \cdot \mathbf{v}_0 \rangle = 32k^2 + 24k + 27 \tag{3.34}$$

is represented. That is, the set of represented numbers contains the 
values of this quadratic polynomial. From this observation, made in 
[GLM+03], it is immediate that (3.25) holds. Geometrically, these 
curvatures correspond to circles in the packing tangent to $C_1$ and $C_2$, since 
these are untouched by the corresponding reflections through $\tilde{C}_4$ and 
$\tilde{C}_3$. For example, the values $k = -2, -1, 0, 1, 2$ in (3.34) give curvatures 
107, 35, 27, 83, 203, respectively. These are visible in Figure 1; they are 
all tangent to the circles of curvature $-10$ (the bounding circle) and 18, 
skipping every other such circle. Using $\mathbf{w}_0 = \mathbf{e}_3$ instead of $\mathbf{e}_4$ in (3.34) 
gives the polynomial $32k^2 - 8k + 23$, the values of which correspond to 
the skipped circles. 
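The polynomial growth under powers of $S_4 S_3$ can be checked directly from the reflection rule $b_j \mapsto 2(\text{sum of the other three}) - b_j$ encoded in (3.13). The sketch below uses hypothetical helper names (`reflect`, `s4s3_power`); negative powers use $(S_4 S_3)^{-1} = S_3 S_4$, which holds because each $S_j$ is an involution.

```python
def reflect(v, j):
    """Apply S_{j+1}: replace entry j by 2 * (sum of the others) - itself."""
    w = list(v)
    w[j] = 2 * (sum(v) - v[j]) - v[j]
    return tuple(w)

def s4s3_power(v, k):
    """Apply (S_4 S_3)^k to v, for any integer k."""
    for _ in range(abs(k)):
        if k > 0:
            v = reflect(reflect(v, 2), 3)  # S_3 acts first, then S_4
        else:
            v = reflect(reflect(v, 3), 2)  # inverse: S_4 acts first, then S_3
    return v

ROOT = (-10, 18, 23, 27)
for k in range(-2, 3):
    v = s4s3_power(ROOT, k)
    # Columns: k, quadruple, value of (3.34), value of the e_3 polynomial.
    print(k, v, 32 * k * k + 24 * k + 27, 32 * k * k - 8 * k + 23)
```

In each printed row the last two numbers should agree with the fourth and third entries of the quadruple, respectively, while the first two entries stay at $-10$ and 18.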
To prove (3.26), we make the following observation, due to Sarnak 
[Sar07]. It is well known that the matrices $\pm\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$ and $\pm\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ (which 
map under $\rho$ to $S_4 S_3$ and $S_2 S_3$, respectively) generate the group 

$$\Lambda(2) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z}) : b, c \equiv 0 \pmod 2 \right\}.$$

This is the so-called level-2 principal congruence subgroup of $SL_2(\mathbb{Z})$. 
Hence by Lemma 3.32, the group $\Gamma$ contains 

$$\Xi := \langle S_2 S_3, S_4 S_3 \rangle = \rho(\Lambda(2)). \tag{3.35}$$

The point is that $\Lambda(2)$ is arithmetic: for any integer $\ell$ coprime to $2k$, 
there is a matrix $\begin{pmatrix} * & * \\ 2k & \ell \end{pmatrix} \in \Lambda(2)$. One can work out, with the same $\mathbf{v}_0$ 
and $\mathbf{w}_0$ as above, that 

$$\left\langle \mathbf{e}_4, \rho\begin{pmatrix} * & * \\ 2k & \ell \end{pmatrix} \cdot \mathbf{v}_0 \right\rangle = 32k^2 + 24k\ell + 17\ell^2 + 10. \tag{3.36}$$

For example, the choices $(2k, \ell) = (4, -3), (2, -1), (4, -1)$, and $(6, -1)$ 
give curvatures 147, 35, 107, and 243, respectively, visible up the left 
side of Figure 1, all tangent to the bounding circle (since $\Xi$ in (3.35) 
fixes $C_1$). Observe also that setting $\ell = 1$ in (3.36) recovers (3.34). 
So, as observed by Sarnak [Sar07], the set $\mathscr{B}$ of represented numbers 
contains all primitive (meaning with $2k$ and $\ell$ coprime) values of the 
shifted binary quadratic form in (3.36). Note that the quadratic form 
has discriminant $24^2 - 4 \cdot 32 \cdot 17 = -1600$, and so (3.36) is definite, 
taking only positive values. The number of distinct primitive values of 
(3.36) up to $N$ was determined by Landau [Lan08]: it is asymptotic 
to a constant times $N / \sqrt{\log N}$, thereby proving (3.26). A much more 
delicate and clever but still "elementary" (no automorphic forms are 
harmed) argument goes into the proof of (3.27), using an ensemble of 
such shifted binary quadratic forms. For Theorem A, one needs the 
theory of automorphic representations for the full Apollonian group, 
as hinted to at the end of §3.1.6. 
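Sarnak's observation lends itself to a brute-force check: every primitive value of the shifted form (3.36) below a cutoff should occur among the curvatures of the packing. The sketch below is illustrative; the names `curvatures_below` and `shifted_form`, the cutoff 3000, and the search box for $(k, \ell)$ are arbitrary choices rather than anything in the text, and the enumeration of curvatures assumes the monotonicity along reduced words described in §3.1.5.

```python
from math import gcd

def curvatures_below(root, T):
    """Set of distinct curvatures < T in the gasket with root quadruple `root`."""
    found = {b for b in root if b < T}
    stack = [(tuple(root), None)]
    while stack:
        v, last = stack.pop()
        for j in range(4):
            if j == last:
                continue  # skip non-reduced words
            new_b = 2 * (sum(v) - v[j]) - v[j]
            if new_b < T:
                found.add(new_b)
                stack.append((v[:j] + (new_b,) + v[j + 1:], j))
    return found

def shifted_form(k, ell):
    """The shifted binary quadratic form in (3.36)."""
    return 32 * k * k + 24 * k * ell + 17 * ell * ell + 10

CUTOFF = 3000
represented = curvatures_below((-10, 18, 23, 27), CUTOFF)

# The form is positive definite, so a box of side 20 comfortably contains
# every (k, ell) whose value is below the cutoff.
primitive_values = {
    shifted_form(k, ell)
    for k in range(-20, 21)
    for ell in range(-20, 21)
    if gcd(2 * k, ell) == 1 and shifted_form(k, ell) < CUTOFF
}
print(len(primitive_values), "primitive values below", CUTOFF)
print("all represented:", primitive_values <= represented)
```

Comparing `len(primitive_values)` with `len(represented)` gives a feel for how much of $\mathscr{B}$ is captured by a single shifted form.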

We now leave the discussion of the Apollonian problem, returning to 
it again in §5. 
(a) View from the side (b) View from below 



Figure 10. The thin Pythagorean orbit O in (4.9). 
Points are marked according to whether the hypotenuse 
is prime (•) or composite (w). 

4. The Thin Pythagorean Problem 

A Pythagorean triple $\mathbf{x} = (x, y, z)^t$ is a point on the cone 

$$Q(\mathbf{x}) = 0, \tag{4.1}$$

where $Q$ is the "Pythagorean quadratic form" 

$$Q(\mathbf{x}) := x^2 + y^2 - z^2.$$

Throughout we consider only integral triples, $\mathbf{x} \in \mathbb{Z}^3$, and assume that 
$x$, $y$, and $z$ are coprime; such a triple is called primitive. Elementary 
considerations then force the hypotenuse $z$ to be odd, and $x$ and $y$ to 
be of opposite parity; we assume henceforth that $x$ is odd and $y$ is even. 
The cone has a singularity at the origin, so we only consider its top 
half, assuming subsequently that the hypotenuse is positive, $z > 0$. 

Diophantus (and likely the Babylonians [Pli], who preceded him by 
about as much as he precedes us) knew how to parametrize Pythagorean 
triples: Given $\mathbf{x}$, there is a pair $\mathbf{v} = (u, v)$ of coprime integers of 
opposite parity so that 

$$x = u^2 - v^2, \quad y = 2uv, \quad z = u^2 + v^2. \tag{4.2}$$

That the converse is true is elementary algebra: any such pair $\mathbf{v}$ 
inserted into (4.2) gives rise to a triple $\mathbf{x}$ satisfying (4.1). For example, 
it is easy to see that the triple 

$$\mathbf{x}_0 = (3, 4, 5)^t \tag{4.3}$$

corresponds to the pair 

$$\mathbf{v}_0 = (2, 1)^t. \tag{4.4}$$

4.1. Orbits and the Spin Representation. 

As in the Apollonian case, the Pythagorean form $Q$ has a special 
(determinant one) orthogonal group preserving it: 

$$SO_Q := \{g \in SL_3 : Q(g \cdot \mathbf{x}) = Q(\mathbf{x})\}. \tag{4.5}$$

And as before, this group is also better understood by passing to its 
spin cover. Since the Pythagorean form $Q$ has signature (2, 1), there is 
an accidental isomorphism between its spin group and $SL_2(\mathbb{R})$, given 
explicitly as follows. 

Observe that $SL_2$ acts on a pair $\mathbf{v}$ by left multiplication; via (4.2), 
this action then extends to a linear action on $\mathbf{x}$. In coordinates, it is 
an elementary computation that the action of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ on $\mathbf{v}$ corresponds 
to left multiplication on $\mathbf{x}$ by 

$$\frac{1}{ad - bc} \begin{pmatrix} \frac{1}{2}(a^2 - b^2 - c^2 + d^2) & ac - bd & \frac{1}{2}(a^2 - b^2 + c^2 - d^2) \\ ab - cd & bc + ad & ab + cd \\ \frac{1}{2}(a^2 + b^2 - c^2 - d^2) & ac + bd & \frac{1}{2}(a^2 + b^2 + c^2 + d^2) \end{pmatrix}. \tag{4.6}$$

One can check directly from the definition (4.5) that (4.6) is an element 
of $SO_Q$, and hence we have explicitly constructed the spin 
homomorphism 

$$\rho : SL_2(\mathbb{R}) \to SO_Q(\mathbb{R}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto (4.6).$$

Given a Pythagorean triple $\mathbf{x}_0$, such as that in (4.3), the group $\Gamma :=$ 
$SO_Q(\mathbb{Z})$ of all integer matrices in $SO_Q$ acts by left multiplication, giving 
the full orbit $\mathcal{O} = \Gamma \cdot \mathbf{x}_0$ of all Pythagorean triples (with our convention 
that $z > 0$, $x$ is odd, and $y$ is even). 

Via (4.2) again, this $SO_Q$ action on $\mathbf{x}$ is equivalent to the $SL_2$ action 
on $\mathbf{v}$. For a primitive $\mathbf{v} \in \mathbb{Z}^2$, both the integrality and primitivity are 
preserved by restricting the action to just the integral matrices $SL_2(\mathbb{Z})$. 
Moreover, one should preserve the parity condition on $\mathbf{v}$ by restricting 
further to only the principal 2-congruence subgroup 

$$\Lambda(2) = \{\gamma \in SL_2(\mathbb{Z}) : \gamma \equiv I \pmod 2\} = \left\langle \pm\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \pm\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \right\rangle,$$

which already appeared in §3.2.2. One can check directly that the 
image (4.6) of any $\gamma \in \Lambda(2)$ is an integral matrix, that is, in $SO_Q(\mathbb{Z})$. 
For $\mathbf{v}_0$ corresponding to $\mathbf{x}_0$, the orbit $\tilde{\mathcal{O}} := \tilde{\Gamma} \cdot \mathbf{v}_0$ under the full group 
$\tilde{\Gamma} := \Lambda(2)$ consists of all coprime $(u, v)$ with $u$ even and $v$ odd. 

Prompted by the Affine Sieve^ [BGS06, BGS10, SGS11] one may 
wish to study thin orbits $\mathcal{O}$ of Pythagorean triples. Here one replaces 
the full group $SO_Q(\mathbb{Z})$ by some finitely generated subgroup $\Gamma$ of infinite 
index. Equivalently one can consider an orbit $\tilde{\mathcal{O}}$ of $\mathbf{v}_0$ under an infinite 
index subgroup $\tilde{\Gamma}$ of $\Lambda(2)$. We illustrate the general theory via the 
following concrete example. 

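We first give a sample orbit: in comparison with the generators 
of $\Lambda(2)$, let $\tilde{\Gamma}$ be the group generated by the following two matrices 

$$\tilde{\Gamma} := \left\langle \pm\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \pm\begin{pmatrix} 1 & 0 \\ 4 & 1 \end{pmatrix} \right\rangle. \tag{4.7}$$

This group clearly sits inside $\Lambda(2)$ but it is not immediately obvious 
whether it is of finite or infinite index; as we will show later, the index 
is infinite. Taking the base pair $\mathbf{v}_0$ in (4.4), we form the orbit 

$$\tilde{\mathcal{O}} := \tilde{\Gamma} \cdot \mathbf{v}_0. \tag{4.8}$$

Correspondingly, we can take the base triple $\mathbf{x}_0$ in (4.3), and form the 
orbit 

$$\mathcal{O} := \Gamma \cdot \mathbf{x}_0 \tag{4.9}$$

of $\mathbf{x}_0$ under the group 

$$\Gamma := \langle M_1, M_2 \rangle, \tag{4.10}$$

where $M_1$ and $M_2$ are the images under $\rho$ of the matrices generating 
$\tilde{\Gamma}$; one can elementarily compute from (4.7) and (4.6) that 

$$M_1 := \begin{pmatrix} -1 & -2 & -2 \\ 2 & 1 & 2 \\ 2 & 2 & 3 \end{pmatrix}, \quad M_2 := \begin{pmatrix} -7 & 4 & 8 \\ -4 & 1 & 4 \\ -8 & 4 & 9 \end{pmatrix}. \tag{4.11}$$

Figure 10 illustrates this orbit $\mathcal{O}$. We can visually verify that the orbit 
looks thin, and in the next subsection we confirm this rigorously. 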
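The passage from $2 \times 2$ generators to the $3 \times 3$ matrices acting on triples can be carried out mechanically from the formula (4.6). The sketch below is a hedged illustration: it assumes the generators of $\tilde{\Gamma}$ are $\pm\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ and $\pm\begin{pmatrix} 1 & 0 \\ 4 & 1 \end{pmatrix}$ (a reading that is consistent with the middle rows of (4.11) and with the fundamental domain in §4.2), and the function names `rho`, `act`, and `orbit_up_to_depth` are hypothetical.

```python
from fractions import Fraction

def rho(a, b, c, d):
    """The 3x3 matrix (4.6) attached to [[a, b], [c, d]], with exact entries."""
    det = a * d - b * c
    rows = [
        [Fraction(a*a - b*b - c*c + d*d, 2), a*c - b*d, Fraction(a*a - b*b + c*c - d*d, 2)],
        [a*b - c*d, b*c + a*d, a*b + c*d],
        [Fraction(a*a + b*b - c*c - d*d, 2), a*c + b*d, Fraction(a*a + b*b + c*c + d*d, 2)],
    ]
    return [[Fraction(entry) / det for entry in row] for row in rows]

def act(matrix, x):
    """Matrix-vector product; results are converted back to exact integers."""
    return tuple(int(sum(matrix[r][c] * x[c] for c in range(3))) for r in range(3))

# Assumed generators of the thin group, together with their inverses.
M1, M1_INV = rho(1, 2, 0, 1), rho(1, -2, 0, 1)
M2, M2_INV = rho(1, 0, 4, 1), rho(1, 0, -4, 1)

def orbit_up_to_depth(x0, depth):
    """Triples reached from x0 by words of length <= depth in M1, M2 and inverses."""
    seen = {tuple(x0)}
    frontier = [tuple(x0)]
    for _ in range(depth):
        next_frontier = []
        for x in frontier:
            for M in (M1, M1_INV, M2, M2_INV):
                y = act(M, x)
                if y not in seen:
                    seen.add(y)
                    next_frontier.append(y)
        frontier = next_frontier
    return seen

triples = orbit_up_to_depth((3, 4, 5), depth=4)
print(len(triples), "triples; all Pythagorean:",
      all(x * x + y * y == z * z for x, y, z in triples))
```

Note that legs may come out negative, since the orbit lives on the cone (4.1) rather than among positive side lengths; only the hypotenuse $z$ stays positive.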

4.2. The Orbit is Thin. 

The group $SL_2(\mathbb{R})$ also acts on the hyperbolic upper half-plane 

$$\mathbb{H} := \{z = x + iy : x \in \mathbb{R},\ y > 0\}$$

by fractional linear transformations, 

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} : z \mapsto \frac{az + b}{cz + d}. \tag{4.12}$$

^We have insufficient room to survey this beautiful theory, for which the reader 
is directed to any number of excellent surveys, e.g. [SG12]. 

The action of our group $\tilde{\Gamma}$ in (4.7) on $\mathbb{H}$ has a fundamental domain 
(the definition is similar to (3.10)) given by 

$$\{z \in \mathbb{H} : |\mathrm{Re}(z)| < 1,\ |z - 1/4| > 1/4,\ |z + 1/4| > 1/4\},$$

where the distances above are Euclidean; see Figure 11a. The 
hyperbolic measure is $y^{-2}\, dx\, dy$, and hence this region again has infinite 
hyperbolic area. Equivalently, the index of $\tilde{\Gamma}$ in $\Lambda(2)$ is infinite, as 
claimed. 

Any orbit of a fixed base point in $\mathbb{H}$ under $\tilde{\Gamma}$ has some limit set 
$\mathfrak{C} = \mathfrak{C}(\tilde{\Gamma})$ in the boundary $\partial\mathbb{H}$. A piece of this Cantor-like set can 
already be seen in Figure 11a. But to see it fully, we show in Figure 
11b the same $\tilde{\Gamma}$-orbit in the disk model 

$$\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\},$$

by composing the action of $\tilde{\Gamma}$ with the map 

$$\mathbb{H} \to \mathbb{D} : z \mapsto \frac{z - i}{z + i}$$

(which encodes the observation that points in the upper half plane are 
closer to $i$ than they are to $-i$). In the disk model, one more clearly 
sees the limit set as the set of "directions" in which the orbit $\mathcal{O}$ can 
grow; juxtapose Figure 10b on Figure 11b. This limit set $\mathfrak{C}$ has some 
Hausdorff dimension $\delta = \delta(\tilde{\Gamma}) \in [0, 1]$; one can estimate 

$$\delta \approx 0.59\ldots \tag{4.13}$$

This dimension (also called the "critical exponent of $\Gamma$") is again an 
important geometric invariant, measuring the "thinness" of $\Gamma$, as 
illustrated in the following counting statement [Kon07, Kon09, KO12]. Let 
$\|\mathbf{x}\|$ be the Euclidean norm. There is some $c > 0$ so that 

$$\#\{\mathbf{x} \in \mathcal{O} : \|\mathbf{x}\| < N\} \sim c\, N^{\delta}, \quad \text{as } N \to \infty. \tag{4.14}$$

Once again, (4.14) should be compared with the orbit of $\mathbf{x}_0$ under 
the full ambient group, $SO_Q(\mathbb{Z})$. Elementary methods show that 

$$\#\{\mathbf{x} \in SO_Q(\mathbb{Z}) \cdot \mathbf{x}_0 : \|\mathbf{x}\| < N\} \sim c \cdot N.$$

So in passing from the full orbit to $\mathcal{O}$, the asymptotic drops from $N^1$ to 
$N^{\delta}$, with $\delta < 1$. Thus the orbit $\mathcal{O}$ is thin. 

The fact that $\rho$ is a quadratic map in the entries (see (4.6)) implies 
that the count (4.14) on triples $\mathbf{x} \in \mathcal{O}$ is equivalent to the following 
asymptotic for the pairs $\mathbf{v} \in \tilde{\mathcal{O}}$: 

$$\#\{\mathbf{v} \in \tilde{\mathcal{O}} : \|\mathbf{v}\| < N\} \sim c' \cdot N^{2\delta}, \tag{4.15}$$

as $N \to \infty$. Note that the power of $N$ is now $2\delta$. This can also be seen 
immediately from (4.1) and (4.2), which give 

$$\|\mathbf{x}\| = \sqrt{x^2 + y^2 + z^2} = \sqrt{2}\, z = \sqrt{2}\,(u^2 + v^2) = \sqrt{2}\, \|\mathbf{v}\|^2. \tag{4.16}$$

(Geometrically, the cone (4.1) intersects the sphere of radius $N$ at a 
circle of radius $N/\sqrt{2}$.) Observe that (4.14) looks like the Apollonian 
asymptotic (3.5), while (4.15) is more similar to Hensley's estimate 
(2.16) in Zaremba's problem. This is just a consequence of choosing 
between working in the orthogonal group or its spin cover. 

4.3. Diophantine Problems. 

One can now pose a variety of Diophantine questions about the values
of various functions on such thin orbits. Given an orbit
$\mathcal{O} = \Gamma \cdot \mathbf{x}_0$ and a function $f : \mathcal{O} \to \mathbb{Z}$, call

$$\mathscr{S} := f(\mathcal{O}) \subset \mathbb{Z} \tag{4.17}$$

the set of represented numbers. That is, $n$ is represented by the pair
$(\mathcal{O}, f)$ if there is some $\gamma \in \Gamma$ so that $n = f(\gamma \cdot \mathbf{x}_0)$. And as before, we
say $n$ is admissible if $n \in \mathscr{S} \ (\mathrm{mod}\ q)$ for all $q$. For example, if $f$ is the
"hypotenuse" function, $f(\mathbf{x}) = z$, one can ask whether $(\mathcal{O}, f)$ represents
infinitely many admissible primes. Evidence to the affirmative is
illustrated in Figure 10, where a triple is highlighted if its hypotenuse
is prime. Unfortunately this problem on thin orbits$^{11}$ seems out of reach
of current technology.

$^{11}$For the full orbit of all Pythagorean triples, infinitely many hypotenuses are
prime. This follows from (4.2), that $z = u^2 + v^2$, and Fermat's theorem that all
primes $\equiv 1 \ (\mathrm{mod}\ 4)$ are sums of two squares.
But for a restricted class $\mathscr{F}$ of functions $f$, and orbits $\mathcal{O}$ which are
"not too thin," recent progress has been made towards the local-global
problem for $\mathscr{S}$. Let $\mathscr{F}$ be the set of functions $f$ which are linear, not
on the triples $\mathbf{x}$, but on the corresponding pairs $\mathbf{v}$. For example, it
is not particularly well known that in a Pythagorean triple, the sum of
the hypotenuse $z$ and the even side $y$ is always a perfect square. This
follows immediately from (4.2); in particular, $y + z = (u + v)^2$. So the
function

$$f(\mathbf{x}) = \sqrt{y + z} = u + v \tag{4.18}$$

is integer-valued on $\mathcal{O}$ and linear$^{12}$ in $\mathbf{v}$.

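Another way of saying this is to pass to the corresponding orbit of
pairs, $\Gamma \cdot \mathbf{v}_0$. Any such linear function on $\mathbf{v}$ is of the form

$$f(\mathbf{v}) = \langle \mathbf{w}_0, \mathbf{v} \rangle, \tag{4.19}$$

for some fixed $\mathbf{w}_0 \in \mathbb{Z}^2$. In the example (4.18), take $\mathbf{w}_0 = (1, 1)^t$. Then
$\mathscr{F}$ consists of all functions on $\mathcal{O}$ which, pulled back to the orbit
$\Gamma \cdot \mathbf{v}_0$ of pairs, are of the form (4.19).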
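
The example (4.18) and its linear form (4.19) are easy to verify numerically. The sketch below assumes the parametrization $x = u^2 - v^2$, $y = 2uv$, $z = u^2 + v^2$ for (4.2); the sample pairs and the helper names `triple` and `inner` are hypothetical and chosen only for illustration.

```python
def triple(u, v):
    """Pythagorean triple (x, y, z) from the pair (u, v), as in (4.2)."""
    return (u * u - v * v, 2 * u * v, u * u + v * v)

def inner(w, v):
    """Standard inner product of two integer pairs."""
    return w[0] * v[0] + w[1] * v[1]

W0 = (1, 1)  # the choice of w_0 made in the text for example (4.18)

# Hypothetical sample of pairs; any integers (u, v) would do here.
samples = [(2, 1), (3, 2), (4, 1), (5, 2), (7, 4)]
for u, v in samples:
    x, y, z = triple(u, v)
    # Hypotenuse plus even side is the perfect square (u + v)^2.
    assert y + z == (u + v) ** 2
    # The square root u + v is the linear form <w_0, v> of (4.19).
    assert inner(W0, (u, v)) == u + v
```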

Theorem P (Bourgain-K. 2010 [BK10]). Fix any $f \in \mathscr{F}$ and let $\mathscr{S}$
be the set of represented numbers as in (4.17). Assume that the orbit
$\mathcal{O} = \Gamma \cdot \mathbf{x}_0$ is not too thin, in that the exponent $\delta$ of $\Gamma$ is sufficiently large,

$$\delta > \delta_0, \tag{4.20}$$

for some $\delta_0 < 1$. (The value $\delta_0 = 0.99995$ suffices.) Then almost every
admissible number is represented.

We are finally in position to relate this Pythagorean problem to the
Apollonian and Zaremba problems. Indeed, passing to the corresponding orbit
$\Gamma \cdot \mathbf{v}_0$ of pairs and fixing the function $f(\mathbf{v}) = \langle \mathbf{w}_0, \mathbf{v} \rangle$, we have that $n$ is
represented if there is a $\gamma \in \Gamma$ so that

$$n = \langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle. \tag{4.21}$$

That is,

$$\mathscr{S} = \langle \mathbf{w}_0, \Gamma \cdot \mathbf{v}_0 \rangle, \tag{4.22}$$

which is of the same form as (2.28) and (3.28). Given the generators of
$\Gamma$, the condition of admissibility is again analyzed by strong
approximation, Goursat's Lemma, and finite group theory, as in §2.2.

$^{12}$Really we want the values of $|u + v|$, which within the positive integers are the
union of the values of $u + v$ and $-u - v$. Alternatively, we can assume that $-I \in \Gamma$,
as is the case for (4.7).
Note that in light of (4.15), the minimal dimension $\delta_0$ in (4.20) cannot
go below 1/2: the numbers in $\mathscr{S}$ up to $N$ (counted with multiplicity)
have cardinality roughly $N^{2\delta}$, so if $\delta$ is less than 1/2, then certainly
a local-global principle fails miserably. (Such a phenomenon appeared
already in the context of Hensley's conjecture (2.14) in Zaremba's
problem.)
5. The Circle Method: Tools and Proofs

We briefly review the previous three sections, unifying the
(re)formulations of the problems. The Apollonian, Pythagorean, and Zaremba
Theorems will henceforth be referred to as Theorem X, where

$$X = A,\ P, \text{ or } Z,$$

respectively. Theorem X concerns the set $\mathscr{S}$ of numbers of the form

$$\mathscr{S} = \langle \mathbf{w}_0, \Gamma \cdot \mathbf{v}_0 \rangle. \tag{5.1}$$

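Here

$$\mathscr{S} = \begin{cases}
\text{the set of curvatures (3.28)} & \text{if } X = A, \\
\text{the set of square-roots of sums of hypotenuses and even sides (4.22)} & \text{if } X = P, \\
\text{the set of denominators (2.28)} & \text{if } X = Z,
\end{cases}$$

$$\Gamma = \begin{cases}
\text{the Apollonian group } \Gamma & \text{if } X = A, \\
\text{an infinite index subgroup } \Gamma < \Lambda(2) & \text{if } X = P, \\
\text{the semigroup } \Gamma_A & \text{if } X = Z,
\end{cases}$$

$$\mathbf{v}_0 = \begin{cases}
\text{the root quadruple} & \text{if } X = A, \\
\text{any coprime pair of opposite parity} & \text{if } X = P, \\
(0, 1)^t & \text{if } X = Z,
\end{cases}$$

and

$$\mathbf{w}_0 = \begin{cases}
\text{a standard basis vector} & \text{if } X = A, \\
\text{any fixed pair} & \text{if } X = P, \\
(0, 1)^t & \text{if } X = Z.
\end{cases}$$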

But now we can forget the individual problems and just focus on the 
general setting (5.1); one need not keep the above taxonomy in one's 
head throughout. 

To study the local-global problem for $\mathscr{S}$, we introduce the
representation function

$$\mathcal{R}_N(n) := \sum_{\gamma \in \Omega_N} \mathbf{1}_{\{n = \langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle\}}. \tag{5.2}$$

Here $N$ is a growing parameter, and $\Omega_N$ is a certain subset of the radius
$N$ ball in $\Gamma$,

$$\Omega_N \subset \{\gamma \in \Gamma : \|\gamma\| < N\},$$

which we will describe in more detail later. For now, one can just think
of $\Omega_N$ as the whole radius $N$ ball. To get our bearings, let us recall
roughly the size of $\Omega_N$. It is convenient to introduce the parameter $\alpha$,
defined by

$$\alpha := \begin{cases}
\delta, \text{ the dimension of an Apollonian packing} & \text{if } X = A, \text{ see (3.4)}, \\
2\delta, \text{ where } \delta \text{ is the dimension of } \Lambda(\Gamma) & \text{if } X = P, \text{ see (4.13)}, \\
2\delta_A, \text{ where } \delta_A \text{ is the dimension of the associated Cantor set} & \text{if } X = Z, \text{ see (2.10)}.
\end{cases}$$

In each case $\alpha$ satisfies

$$1 < \alpha < 2. \tag{5.3}$$
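Then the cardinality of such a ball is roughly

$$\#\{\gamma \in \Gamma : \|\gamma\| < N\} \asymp \begin{cases}
N^{\delta} & \text{if } X = A, \text{ see (3.5)}, \\
N^{2\delta} & \text{if } X = P, \text{ see (4.15)}, \\
N^{2\delta_A} & \text{if } X = Z, \text{ see (2.16)}.
\end{cases}$$

(Technically the quoted results are about counting in the corresponding
orbits $\mathcal{O}$ and not in the groups $\Gamma$; but the order of magnitude is the
same for both.) We can write this uniformly by giving the cardinality
of $\Omega_N$ as

$$|\Omega_N| \asymp N^{\alpha}. \tag{5.4}$$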

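Returning to (5.2), we see by construction that $\mathcal{R}_N$ is nonnegative.
Moreover observe that

$$\text{if } \mathcal{R}_N(n) > 0, \text{ then certainly } n \text{ is represented.} \tag{5.5}$$

Also record that

$$\mathcal{R}_N \text{ is supported on } n \text{ of size } |n| \ll N. \tag{5.6}$$

Recalling the notation $e(x) = e^{2\pi i x}$, the Fourier transform

$$S_N(\theta) := \widehat{\mathcal{R}_N}(\theta) = \sum_{n \in \mathbb{Z}} \mathcal{R}_N(n)\, e(n\theta)
= \sum_{\gamma \in \Omega_N} e\big(\theta \langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle\big) \tag{5.7}$$

is a wildly oscillating exponential sum on the circle $\mathbb{R}/\mathbb{Z} = [0, 1)$, whose
graph looks something like Figure 12. One recovers $\mathcal{R}_N$ through
elementary Fourier inversion,

$$\mathcal{R}_N(n) = \int_{\mathbb{R}/\mathbb{Z}} S_N(\theta)\, e(-n\theta)\, d\theta, \tag{5.8}$$

but without further ingredients, one is going around in circles (no pun
intended).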
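
The inversion (5.8) can be illustrated on a toy example. The sketch below is a simplification under stated assumptions: it sums over orbit points rather than group elements, it assumes the generators suggested by (5.20), the base pair (2, 1), and $\mathbf{w}_0 = (1, 1)$, and it replaces the integral by an $M$-point Riemann sum, which is exact once $M$ exceeds twice the largest value of $|n|$. All function names are hypothetical.

```python
import cmath
import math
from collections import Counter, deque

# Assumed generators (and inverses), inferred from the edge rule (5.20).
GENERATORS = [((1, 2), (0, 1)), ((1, -2), (0, 1)),
              ((1, 0), (4, 1)), ((1, 0), (-4, 1))]

def act(g, v):
    (a, b), (c, d) = g
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def orbit_in_ball(v0, radius):
    """Pruned breadth-first search; may undercount the true orbit."""
    seen, queue = {v0}, deque([v0])
    while queue:
        v = queue.popleft()
        for g in GENERATORS:
            w = act(g, v)
            if w not in seen and math.hypot(*w) < radius:
                seen.add(w)
                queue.append(w)
    return seen

def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)

W0, V0, RADIUS = (1, 1), (2, 1), 60.0
values = [W0[0] * v[0] + W0[1] * v[1] for v in orbit_in_ball(V0, RADIUS)]
R = Counter(values)            # toy stand-in for the function R_N(n)

def S(theta):
    """Toy analogue of the exponential sum (5.7)."""
    return sum(e(theta * n) for n in values)

# M > 2 * max|n| guarantees distinct values stay distinct modulo M.
M = 2 * max(abs(n) for n in values) + 1
grid = [S(k / M) for k in range(M)]

def recovered(n):
    """Discrete version of the inversion formula (5.8)."""
    total = sum(grid[k] * e(-n * k / M) for k in range(M))
    return (total / M).real
```

For any `n` with `abs(n)` at most the largest sampled value, `recovered(n)` should agree with `R[n]` up to floating-point error.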

Building on the Hardy-Ramanujan technique for asymptotics of the
partition function, Hardy and Littlewood had the idea that the bulk of
the integral (5.8) could be captured just by integrating over frequencies
$\theta$ that are very close to rational numbers $a/q$, $(a, q) = 1$, with very small
denominators $q$; some of these intervals are shaded in Figure 12. These
are now called the major arcs $\mathfrak{M}$; the name refers not to their total
length (they comprise an ever-shrinking fraction of the circle $\mathbb{R}/\mathbb{Z}$) but
to the fact that they are supposed to account for a preponderance of
$\mathcal{R}_N(n)$. Accordingly, we decompose (5.8) as

$$\mathcal{R}_N(n) = \mathcal{M}_N(n) + \mathcal{E}_N(n),$$

where the major arc contribution

$$\mathcal{M}_N(n) := \int_{\mathfrak{M}} S_N(\theta)\, e(-n\theta)\, d\theta \tag{5.9}$$

is supposed to give the "main" term, and

$$\mathcal{E}_N(n) := \int_{\mathfrak{m}} S_N(\theta)\, e(-n\theta)\, d\theta \tag{5.10}$$

should be the "error". Here $\mathfrak{m} := [0, 1) \setminus \mathfrak{M}$ are the complementary
so-called minor arcs. If $\mathcal{M}_N(n)$ is positive and bigger than $|\mathcal{E}_N(n)|$,
then certainly

$$\mathcal{R}_N(n) \ge \mathcal{M}_N(n) - |\mathcal{E}_N(n)| > 0, \tag{5.11}$$






so again, $n$ is represented. In practice, one typically tries to prove an
asymptotic formula (or at least a lower bound) for $\mathcal{M}_N$, and then give
an upper bound for $|\mathcal{E}_N|$.

The reason for this decomposition is that exponential sums such as
$S_N$ should be mostly supported on $\mathfrak{M}$, having their biggest peaks and
valleys at (or very near) these frequencies (some of this phenomenon is
visible in Figure 12). Indeed, the value $\theta = 0$ is as big as $S_N$ will ever
get,

$$|S_N(\theta)| \le S_N(0) = |\Omega_N|, \tag{5.12}$$

which follows trivially (and is thus called the trivial bound) from the
triangle inequality: every summand in (5.7) is a complex number of
absolute value 1. Also for other $\theta \in \mathfrak{M}$, $\theta \approx a/q$, the summands should
all point in a limited number of directions, colluding to give a large
contribution to $S_N$. As we will see later, at these frequencies, one is
in a sense measuring the distribution of $\Gamma$ (or equivalently $\Omega_N$) along
certain arithmetic progressions. This strategy of coaxing out the
(conjectural) main term for $\mathcal{R}_N$ works in surprisingly great generality, but
can also give false predictions (even for the Prime Number Theorem,
see e.g. [Gra95]).
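
The trivial bound (5.12) and the splitting (5.9)-(5.11) can be seen in a discrete toy model. This is only a sketch under stated assumptions: the values are taken from the full (not thin) set of coprime pairs of opposite parity with $\mathbf{w}_0 = (1, 1)$, the integral is replaced by an exact $M$-point Riemann sum, and the "major arcs" are a hypothetical choice (grid points within `WIDTH` of a fraction with denominator at most `Q`), not the arcs used in the actual proofs.

```python
import cmath
import math
from fractions import Fraction

def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)

# Toy stand-in for the values <w0, gamma v0>: u + v over coprime pairs of
# opposite parity in a small ball (the "full" orbit, not a thin one).
RADIUS = 25
values = [u + v
          for u in range(-RADIUS, RADIUS + 1)
          for v in range(-RADIUS, RADIUS + 1)
          if u * u + v * v < RADIUS * RADIUS
          and math.gcd(u, v) == 1 and (u + v) % 2 == 1]

def S(theta):
    """Toy analogue of the exponential sum (5.7)."""
    return sum(e(theta * n) for n in values)

M = 2 * max(abs(n) for n in values) + 1      # grid size, avoids aliasing
grid = [S(k / M) for k in range(M)]

# Hypothetical major arcs: grid points within WIDTH of a/q with q <= Q.
Q, WIDTH = 4, 0.02
centers = {Fraction(a, q) for q in range(1, Q + 1) for a in range(q + 1)}

def is_major(k):
    theta = k / M
    return any(abs(theta - float(c)) <= WIDTH for c in centers)

def piece(n, keep):
    """Riemann sum of S(theta) e(-n theta) over the grid points kept."""
    return sum(grid[k] * e(-n * k / M) for k in range(M) if keep(k)) / M

def main_term(n):
    """Discrete analogue of the major arc contribution (5.9)."""
    return piece(n, is_major)

def error_term(n):
    """Discrete analogue of the minor arc contribution (5.10)."""
    return piece(n, lambda k: not is_major(k))
```

By construction `main_term(n) + error_term(n)` reproduces the count `values.count(n)`, and every entry of `grid` is bounded in absolute value by `len(values)`, the toy version of (5.12).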

Having made this decomposition, we should determine what we expect
for the main term. From (5.7), we have that

$$\sum_{n} \mathcal{R}_N(n) = S_N(0) = |\Omega_N|,$$

so recalling the support (5.6) of $\mathcal{R}_N$, one might expect that an
admissible number of size about $n \asymp N$ is represented roughly $|\Omega_N|/N$
times. In particular, since every admissible number is expected to be
represented, one would like to show, say, for $N/2 < n < N$, that

$$\mathcal{M}_N(n) \gg \mathfrak{S}(n)\, \frac{|\Omega_N|}{N}. \tag{5.13}$$

Here $\mathfrak{S}(n) \ge 0$ is a certain product of local densities called the singular
series, and will be discussed at greater length later. It alone is
responsible for the notion of admissibility, vanishing on non-admissible $n$. For
admissible $n$, it typically does not fluctuate too much; crudely one can
show in many contexts the lower bound $\mathfrak{S}(n) \gg N^{-\varepsilon}$ for any $\varepsilon > 0$. For ease
of exposition, let us just pretend for now that every $n$ is admissible
and remove the role of the singular series, allowing ourselves to assume
that

$$\mathfrak{S}(n) = 1. \tag{5.14}$$
Observe also that, in light of (5.4) and (5.3), the lower bound in (5.13)
is of the order $N^{\alpha - 1}$, with $\alpha > 1$. That is, there should be quite a lot of
representations of an admissible $n \asymp N$ large, giving further indication
that every sufficiently large admissible number may be represented.

One is then left with the problem of estimating away the remainder
term $\mathcal{E}_N$, and this is why (as Peter Sarnak likes to say) the circle method
is a "method" and not a "theorem": establishing such estimates is
much more of an art than a science. The Hardy-Littlewood procedure
suggests somehow exploiting the fact that on the minor arc frequencies,
$\theta \in \mathfrak{m}$, the exponential sum $S_N$ in (5.7) should itself already be quite
small, being a sum of canceling phases. If one could indeed prove at
the level of individual $n$ an upper bound for the error term $\mathcal{E}_N$ which is
asymptotically smaller than the lower bound (5.13) for $\mathcal{M}_N$, then one
could immediately conclude that every sufficiently large admissible $n$
is represented. Unfortunately, at present we do not know how to give
such strong upper bounds on the minor arcs.

Instead, we settle for an "almost" local-global statement, by proving
a sharp bound not for individual $n$, but for $n$ in an average sense, as
follows. Parseval's theorem states that the $L^2$ norm of a function is
equal to that of its Fourier transform, that is, the Fourier transform is
a unitary operator on these Hilbert spaces. Using the definition (5.10),
Parseval's theorem then gives

$$\sum_{n} |\mathcal{E}_N(n)|^2 = \int_{\mathfrak{m}} |S_N(\theta)|^2\, d\theta. \tag{5.15}$$

Inserting our trivial bound (5.12) for $S_N$ into the above yields a trivial
bound for (5.15) of

$$\int_{\mathfrak{m}} |S_N(\theta)|^2\, d\theta \le |\Omega_N|^2. \tag{5.16}$$

We claim that it suffices for our applications to establish a bound of
the form

$$\int_{\mathfrak{m}} |S_N(\theta)|^2\, d\theta = o\!\left(\frac{|\Omega_N|^2}{N}\right). \tag{5.17}$$

That is, the above saves a little more than $\sqrt{N}$ on average over $\mathfrak{m}$ off
of each term $S_N$ relative to the trivial bound (5.16). We first explain
why this suffices.

Let $\mathfrak{E}(N)$ be the set of exceptional $n$ (those that are admissible but
not represented) in the range $N/2 < n < N$. Recalling (5.11), the
number of exceptions is bounded by

$$\#\mathfrak{E}(N) \le \sum_{\substack{N/2 < n < N \\ n \text{ is admissible}}} \mathbf{1}_{\{|\mathcal{E}_N(n)| \ge \mathcal{M}_N(n)\}}.$$

For admissible $n$, we have the major arc lower bound (5.13) and recall
our simplifying assumption (5.14); thus

$$\#\mathfrak{E}(N) \ll \sum_{n} \mathbf{1}_{\{|\mathcal{E}_N(n)| \gg |\Omega_N|/N\}}. \tag{5.18}$$

Here is a pleasant (standard) trick: for those $n$ contributing a 1 rather
than 0 to (5.18), we have

$$1 \ll \frac{|\mathcal{E}_N(n)|}{|\Omega_N|/N},$$

both sides of which may be squared. Hence (5.18) implies that

$$\#\mathfrak{E}(N) \ll \sum_{n} |\mathcal{E}_N(n)|^2\, \frac{N^2}{|\Omega_N|^2}.$$

Now we apply Parseval (5.15) and the bound (5.17) which we had
claimed would suffice. This gives

$$\#\mathfrak{E}(N) \ll \frac{N^2}{|\Omega_N|^2} \int_{\mathfrak{m}} |S_N(\theta)|^2\, d\theta = o(N),$$

and thus 100% of the admissible numbers in the range $[N/2, N)$ are
represented. Combining such dyadic intervals, we conclude that almost
every admissible number is represented.
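
The squaring trick above is a Chebyshev-type inequality: the number of $n$ at which an error exceeds a threshold is at most the second moment divided by the squared threshold. A minimal numerical sketch follows; the error values are hypothetical random numbers, and `threshold` merely plays the role of $|\Omega_N|/N$.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N = 1000
threshold = 5.0                       # plays the role of |Omega_N| / N
# Hypothetical error values E(n); any real numbers would do here.
errors = [random.gauss(0.0, 2.0) for _ in range(N)]

# Count the n at which the error is at least as large as the threshold.
exceptions = sum(1 for x in errors if abs(x) >= threshold)
second_moment = sum(x * x for x in errors)

# On each exceptional n we have 1 <= |E(n)|^2 / threshold^2, so summing
# over all n bounds the number of exceptions by the second moment.
assert exceptions <= second_moment / threshold ** 2
```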

Now "all" that is left is to establish the major arcs bound (5.13) and
the error bound (5.17). In the next two subsections, we focus
individually on the tools needed to prove these claims.



5.1. The Major Arcs. 

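Recall that $\mathcal{M}_N$ in (5.9) is an integral over the major arcs $\theta \in \mathfrak{M}$,
with $\theta$ very close to a fraction $a/q$, with $q$ "small" (the meaning of
which is explained below). Also let us pretend for now that $\Omega_N$ is just
the whole $\Gamma$-ball,

$$\Omega_N = \{\gamma \in \Gamma : \|\gamma\| < N\}.$$

We begin by trying to evaluate (5.7) at $\theta = a/q$:

$$S_N\!\left(\frac{a}{q}\right) = \sum_{\substack{\gamma \in \Gamma \\ \|\gamma\| < N}} e\!\left(\frac{a}{q} \langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle\right).$$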
[Figure 13. An expander; shown with q = 101. Edge labels: the matrices
with rows (1, ±2), (0, 1) act by x -> x ± 2, and the matrices with rows
(1, 0), (±4, 1) act by x -> x(±4x + 1)^{-1}.]



An important observation in the above is that the summation may be
grouped according to the residue class mod $q$ of the integer $\langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle$.
Or what is essentially the same, we can decompose the sum according
to the residue class of $\gamma \ (\mathrm{mod}\ q)$. To this end, let $\Gamma_q = \Gamma \ (\mathrm{mod}\ q)$ be
the set of such residue classes (which we have already studied in the
context of admissibility and strong approximation). Then we split the
sum as

$$S_N\!\left(\frac{a}{q}\right) = \sum_{\gamma_0 \in \Gamma_q} e\!\left(\frac{a}{q} \langle \mathbf{w}_0, \gamma_0 \cdot \mathbf{v}_0 \rangle\right) \cdot |\Omega_N| \left[ \frac{1}{|\Omega_N|} \sum_{\gamma \in \Omega_N} \mathbf{1}_{\{\gamma \equiv \gamma_0 \ (\mathrm{mod}\ q)\}} \right], \tag{5.19}$$

where we have artificially multiplied and divided by the cardinality of
$\Omega_N$. Now for $\gamma_0$ fixed, the bracketed term is then measuring the
"probability" that $\gamma \equiv \gamma_0 \ (\mathrm{mod}\ q)$. As one may suspect, our groups do not
have particular preferences for certain residue classes over others; that
is, this probability becomes equidistributed as $N$ grows, with $q$ also
allowed to grow, but at a much slower rate. (In fact, this is exactly
what we mean by the denominator $q$ being "small", relative to $N$,
in the major arcs $\mathfrak{M}$.) To explain how this happens, we briefly discuss
the notion of an expander.

Rather than going into the general theory (for which we refer the
reader to the beautiful survey [Lub12]; see also [Sar04]), we content
ourselves with but one illustrative example of expansion. Figure 13
shows the following graph. For $q = 101$, say, take the vertices to be the
elements of $\mathbb{Z}/q\mathbb{Z}$, organized around the unit circle by placing $x \in \mathbb{Z}/q\mathbb{Z}$
at $e(x/q)$. For the edges, connect each

$$x \text{ to } x \pm 2, \text{ and also to } x(\pm 4x + 1)^{-1}, \tag{5.20}$$

when inversion (mod $q$) is possible. This is nothing more than the
fractional linear action (see (4.12)) of the generating matrices in (4.7)
(and their inverses) on $\mathbb{Z}/q\mathbb{Z}$. We first claim that our graph on $q$ vertices
is "sparse". Indeed, the complete graph (connecting any vertex to any
other) has on the order of $q^2$ edges, whereas our graph has only on the
order of $q$ edges (since (5.20) implies that any vertex is connected to at
most four others). So we have square-rooted the total number of possible
edges, and our graph is indeed quite sparse.

Despite having few edges, it is a fact that this graph is nevertheless
highly connected, in the sense that a random walk on it is rapidly
mixing. Moreover, this rate of mixing, properly normalized, is independent
of the choice of $q$ above. That is, by varying $q$, we in fact have a whole
family of such sparse but highly connected graphs, and with a uniform
mixing rate; this is exactly what characterizes an expander.
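
The graph described by (5.20) is small enough to build and probe directly. The sketch below assumes $q$ is an odd prime (so that inversion fails only when the denominator vanishes), ignores self-loops, and uses a lazy random walk as one hypothetical way to observe mixing; it does not compute the spectral gap, and the function names are illustrative only.

```python
from collections import deque

def build_graph(q):
    """Adjacency sets for the graph of (5.20) on Z/qZ, q an odd prime."""
    adj = {x: set() for x in range(q)}
    for x in range(q):
        targets = [(x + 2) % q, (x - 2) % q]
        for sign in (1, -1):
            denom = (sign * 4 * x + 1) % q
            if denom != 0:                       # inversion is possible
                targets.append(x * pow(denom, -1, q) % q)
        for y in targets:
            if y != x:                           # ignore self-loops
                adj[x].add(y)
                adj[y].add(x)
    return adj

def is_connected(adj):
    """Breadth-first search from an arbitrary vertex."""
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return len(seen) == len(adj)

def distance_from_uniform(adj, steps, start=0):
    """Total variation distance to uniform after a lazy walk from start.

    Each neighbor is chosen with probability 1/8 and the walk stays put
    otherwise, so the transition matrix is symmetric and the uniform
    distribution is stationary.
    """
    q = len(adj)
    p = [0.0] * q
    p[start] = 1.0
    for _ in range(steps):
        nxt = [0.0] * q
        for x, mass in enumerate(p):
            if mass == 0.0:
                continue
            for y in adj[x]:
                nxt[y] += mass / 8.0
            nxt[x] += mass * (1.0 - len(adj[x]) / 8.0)
        p = nxt
    return 0.5 * sum(abs(mass - 1.0 / q) for mass in p)

graph = build_graph(101)
```

Printing `distance_from_uniform(graph, t)` for a few values of `t`, and repeating with other primes `q`, gives an informal view of the uniform mixing behavior that the text describes.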

Proofs of expansion use, among other things, tools from additive
combinatorics, in particular, the so-called sum-product [BKT04, Bou08]
and triple-product [Hel08, BGT10, PS10] estimates, and quite a lot
of other work (see e.g. [SX91, Gam02, BG08, BGS10, Var10, BV11,
SGV11]). Once one proves uniform expansion for such finite graphs,
the statements must be converted into the archimedean form needed
for the bracketed term in (5.19). To handle such counting statements,
one uses

   if X = A: infinite volume spectral and representation theory
             à la §3.1.6, specifically Vinogradov's thesis [Vin12];

   if X = P: similar techniques developed by
             Bourgain-K.-Sarnak [BKS10];

   if X = Z: the thermodynamic formalism, analytically continuing
             certain Ruelle transfer operators [Lal89, Dol98, Nau05]
             and their "congruence" extensions; see [BGS11].

Without going into details, the upshot is that, up to acceptable
error, the bracketed term in (5.19) is just $1/|\Gamma_q|$, confirming the desired
equidistribution. Inserting this estimation into $\mathcal{M}_N$ in (5.9), one uses
these techniques and some more standard circle method analysis to
eventually conclude (5.13).
5.2. The Minor Arcs.

We use different strategies to prove (5.17) for the Pythagorean and 
Zaremba settings X = P or Z, versus the Apollonian setting X = A, 
so we present them individually. 

5.2.1. Pythagorean and Zaremba settings. 

To handle the minor arcs here, we make the observation that the
ensemble $\Omega_N$ in the definition of $S_N$ from (5.7) need not be a full
$\Gamma$-ball. In fact, the definition of $S_N$ can be changed to, say,

$$S_N(\theta) := \sum_{\substack{\gamma_1 \in \Gamma \\ \|\gamma_1\| < \sqrt{N}}} \ \sum_{\substack{\gamma_2 \in \Gamma \\ \|\gamma_2\| < \sqrt{N}}} e\big(\theta \langle \mathbf{v}_0 \gamma_1 \gamma_2, \mathbf{w}_0 \rangle\big), \tag{5.21}$$

without irreparably damaging the major arcs analysis. This new sum
encodes much more of the (semi)group structure of $\Gamma$, while preserving
the property (5.5), where $\mathcal{R}_N$ is redefined by (5.8). (In reality, we use
even more complicated exponential sums.) The advantage of (5.21)
is that we can now exploit this structure à la Vinogradov's method
[Vin37] for estimating bilinear forms. Just one such maneuver is the
following.

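Apply the Cauchy-Schwarz inequality to (5.21) in the $\gamma_1$ variable,
estimating

$$|S_N(\theta)| \le \Bigg( \sum_{\substack{\gamma_1 \in \Gamma \\ \|\gamma_1\| < \sqrt{N}}} 1 \Bigg)^{1/2}
\Bigg( \sum_{\substack{\gamma_1 \in \mathrm{SL}_2(\mathbb{Z}) \\ \|\gamma_1\| < \sqrt{N}}}
\Bigg| \sum_{\substack{\gamma_2 \in \Gamma \\ \|\gamma_2\| < \sqrt{N}}} e\big(\theta \langle \mathbf{v}_0 \gamma_1 \gamma_2, \mathbf{w}_0 \rangle\big) \Bigg|^2 \Bigg)^{1/2}.$$

Notice in the second appearance of $\gamma_1$, we have replaced the thin
and mysterious group $\Gamma$ (or semigroup $\Gamma_A$) by the full ambient group
$\mathrm{SL}_2(\mathbb{Z})$. On one hand, this allows us to now use more classical tools to
get the requisite cancellation in the minor arcs integral. On the other
hand, this type of perturbation argument only succeeds when $\delta$ is near
1, explaining the restrictions (2.22) and (4.20).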

5.2.2. The Apollonian case. 

The above strategy fails for the Apollonian problem, because the
Hausdorff dimension (3.4) is a fixed invariant which refuses to be
adjusted to suit our needs. Instead, we recall that the Apollonian group
$\Gamma$ contains the special (arithmetic) subgroup $\Xi$ from (3.35). Then, like
(5.21), we change the definition of the exponential sum to something
of the form

$$S_N(\theta) := \sum_{\substack{\xi \in \Xi \\ \|\xi\| < X}} \ \sum_{\substack{\gamma \in \Gamma \\ \|\gamma\| < T}} e\big(\theta \langle \mathbf{w}_0, \xi \gamma \cdot \mathbf{v}_0 \rangle\big)$$

for certain parameters $X$ and $T$ chosen optimally in relation to $N$.
One uses the full sum over the group $\Gamma$ to capture the major arcs and
admissibility conditions. For the minor arcs bound, one keeps $\gamma$ fixed
and uses the classical arithmetic group $\Xi$ to get sufficient cancellation
to prove the desired bound (5.17).